research-article

How good are query optimizers, really?

Authors:
Viktor Leis

TUM

TUM
View Profile

,
Andrey Gubichev

TUM

TUM
View Profile

,
Atanas Mirchev

TUM

TUM
View Profile

,
Peter Boncz

CWI

CWI
View Profile

,
Alfons Kemper

TUM

TUM
View Profile

,
Thomas Neumann

TUM

TUM
View Profile

Proceedings of the VLDB Endowment Volume 9 Issue 3pp 204–215https://doi.org/10.14778/2850583.2850594

Published:01 November 2015Publication History

Proceedings of the VLDB Endowment

Abstract

Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We investigate the quality of industrial-strength cardinality estimators and find that all estimators routinely produce large errors. We further show that while estimates are essential for finding a good join order, query performance is unsatisfactory if the query engine relies too heavily on these estimates. Using another set of experiments that measure the impact of the cost model, we find that it has much less influence on query performance than the cardinality estimates. Finally, we investigate plan enumeration techniques comparing exhaustive dynamic programming with heuristic algorithms and find that exhaustive enumeration improves performance despite the sub-optimal cardinality estimates.

References

R. Ahmed, R. Sen, M. Poess, and S. Chakkappen. Of snowstorms and bushy trees. PVLDB, 7(13):1452--1461, 2014. Google ScholarDigital Library
B. Babcock and S. Chaudhuri. Towards a robust query optimizer: A principled and practical approach. In SIGMOD, pages 119--130, 2005. Google ScholarDigital Library
S. Bellamkonda, H.-G. Li, U. Jagtap, Y. Zhu, V. Liang, and T. Cruanes. Adaptive and big data scale parallel execution in Oracle. PVLDB, 6(11):1102--1113, 2013. Google ScholarDigital Library
R. Borovica-Gajic, S. Idreos, A. Ailamaki, M. Zukowski, and C. Fraser. Smooth scan: Statistics-oblivious access paths. In ICDE, pages 315--326, 2015.Google ScholarCross Ref
N. Bruno, C. A. Galindo-Legaria, and M. Joshi. Polynomial heuristics for query optimization. In ICDE, pages 589--600, 2010.Google ScholarCross Ref
S. Chaudhuri. Query optimizers: time to rethink the contract? In SIGMOD, pages 961--968, 2009. Google ScholarDigital Library
S. Chaudhuri, V. R. Narasayya, and R. Ramamurthy. Exact cardinality query optimization for optimizer testing. PVLDB, 2(1):994--1005, 2009. Google ScholarDigital Library
M. Colgan. Oracle adaptive joins. https://blogs.oracle.com/optimizer/entry/what_s_new_in_12c, 2013.Google Scholar
A. Dutt and J. R. Haritsa. Plan bouquets: query processing without selectivity estimation. In SIGMOD, pages 1039--1050, 2014. Google ScholarDigital Library
C. Estan and J. F. Naughton. End-biased samples for join cardinality estimation. In ICDE, page 20, 2006. Google ScholarDigital Library
L. Fegaras. A new heuristic for optimizing large queries. In DEXA, pages 726--735, 1998. Google ScholarDigital Library
P. Fender and G. Moerkotte. Counter strike: Generic top-down join enumeration for hypergraphs. PVLDB, 6(14):1822--1833, 2013. Google ScholarDigital Library
P. Fender, G. Moerkotte, T. Neumann, and V. Leis. Effective and robust pruning for top-down join enumeration algorithms. In ICDE, pages 414--425, 2012. Google ScholarDigital Library
C. Fraser, L. Giakoumakis, V. Hamine, and K. F. Moore-Smith. Testing cardinality estimation models in SQL Server. In DBtest, 2012. Google ScholarDigital Library
G. Graefe. A generalized join algorithm. In BTW, pages 267--286, 2011.Google Scholar
Z. Gu, M. A. Soliman, and F. M. Waas. Testing the accuracy of query optimizers. In DBTest, 2012. Google ScholarDigital Library
P. J. Haas, J. F. Naughton, S. Seshadri, and A. N. Swami. Selectivity and cost estimation for joins based on random sampling. J Computer System Science, 52(3):550--569, 1996. Google ScholarDigital Library
J. R. Haritsa. The Picasso database query optimizer visualizer. PVLDB, 3(2):1517--1520, 2010. Google ScholarDigital Library
I. F. Ilyas, V. Markl, P. J. Haas, P. Brown, and A. Aboulnaga. CORDS: automatic discovery of correlations and soft functional dependencies. In SIGMOD, pages 647--658, 2004. Google ScholarDigital Library
Y. E. Ioannidis. The history of histograms (abridged). In VLDB, pages 19--30, 2003. Google ScholarDigital Library
Y. E. Ioannidis and S. Christodoulakis. On the propagation of errors in the size of join results. In SIGMOD, 1991. Google ScholarDigital Library
R. A. Kader, P. A. Boncz, S. Manegold, and M. van Keulen. ROX: run-time optimization of XQueries. In SIGMOD, pages 615--626, 2009. Google ScholarDigital Library
R. Kaushik, C. Ré, and D. Suciu. General database statistics using entropy maximization. In DBPL, pages 84--99, 2009. Google ScholarDigital Library
Q. Li, M. Shao, V. Markl, K. S. Beyer, L. S. Colby, and G. M. Lohman. Adaptively reordering joins during query execution. In ICDE, pages 26--35, 2007.Google ScholarCross Ref
F. Liu and S. Blanas. Forecasting the cost of processing multi-join queries via hashing for main-memory databases. In SoCC, pages 153--166, 2015. Google ScholarDigital Library
G. Lohman. Is query optimization a solved problem? http://wp.sigmod.org/?p=1075, 2014.Google Scholar
L. F. Mackert and G. M. Lohman. R* optimizer validation and performance evaluation for local queries. In SIGMOD, pages 84--95, 1986. Google ScholarDigital Library
V. Markl, N. Megiddo, M. Kutsch, T. M. Tran, P. J. Haas, and U. Srivastava. Consistently estimating the selectivity of conjuncts of predicates. In VLDB, pages 373--384, 2005. Google ScholarDigital Library
G. Moerkotte and T. Neumann. Dynamic programming strikes back. In SIGMOD, pages 539--552, 2008. Google ScholarDigital Library
G. Moerkotte, T. Neumann, and G. Steidl. Preventing bad plans by bounding the impact of cardinality estimation errors. PVLDB, 2(1):982--993, 2009. Google ScholarDigital Library
I. Müller, P. Sanders, A. Lacurie, W. Lehner, and F. Färber. Cache-efficient aggregation: Hashing is sorting. In SIGMOD, pages 1123--1136, 2015. Google ScholarDigital Library
T. Neumann. Query simplification: graceful degradation for join-order optimization. In SIGMOD, pages 403--414, 2009. Google ScholarDigital Library
T. Neumann and C. A. Galindo-Legaria. Taking the edge off cardinality estimation errors using incremental execution. In BTW, pages 73--92, 2013.Google Scholar
V. Poosala and Y. E. Ioannidis. Selectivity estimation without the attribute value independence assumption. In VLDB, pages 486--495, 1997. Google ScholarDigital Library
F. Rusu and A. Dobra. Sketches for size of join estimation. TODS, 33(3), 2008. Google ScholarDigital Library
P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In SIGMOD, pages 23--34, 1979. Google ScholarDigital Library
M. Steinbrunn, G. Moerkotte, and A. Kemper. Heuristic and randomized optimization for the join ordering problem. VLDB J., 6(3):191--208, 1997. Google ScholarDigital Library
M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2's learning optimizer. In VLDB, pages 19--28, 2001. Google ScholarDigital Library
K. Tzoumas, A. Deshpande, and C. S. Jensen. Lightweight graphical models for selectivity estimation without independence assumptions. PVLDB, 4(11):852--863, 2011.Google ScholarDigital Library
F. Waas and A. Pellenkoft. Join order selection - good enough is easy. In BNCOD, pages 51--67, 2000. Google ScholarDigital Library
F. M. Waas, L. Giakoumakis, and S. Zhang. Plan space analysis: an early warning system to detect plan regressions in cost-based optimizers. In DBTest, 2011. Google ScholarDigital Library
W. Wu, Y. Chi, S. Zhu, J. Tatemura, H. Hacigümüs, and J. F. Naughton. Predicting query execution time: Are optimizer cost models really unusable? In ICDE, pages 1081--1092, 2013. Google ScholarDigital Library
F. Yu, W. Hou, C. Luo, D. Che, and M. Zhu. CS2: a new database synopsis for query estimation. In SIGMOD, pages 469--480, 2013. Google ScholarDigital Library

Index Terms

How good are query optimizers, really?
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database query processing
    2. Query languages
      1. Relational database query languages
  2. Information retrieval
    1. Information retrieval query processing
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory
      1. Database query processing and optimization (theory)

Index terms have been assigned to the content through auto-classification.

Recommendations

Building query optimizers with combinators
Read More
View-based query containment
PODS '03: Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

Query containment is the problem of checking whether for all databases the answer to a query is a subset of the answer to a second query. In several data management tasks, such as data integration, mobile computing, etc., the data of interest are only ...
Read More
Query Folding
ICDE '96: Proceedings of the Twelfth International Conference on Data Engineering

Query folding refers to the activity of determining if and how a query can be answered using a given set of resources, which might be materialized views, cached results of previous queries, or queries answerable by other databases. We investigate query ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

Proceedings of the VLDB Endowment Volume 9, Issue 3
November 2015
144 pages
ISSN:2150-8097
Issue’s Table of Contents
Sponsors
In-Cooperation
Publisher
VLDB Endowment
Publication History
- Published: 1 November 2015
Published in pvldb Volume 9, Issue 3
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 250
  Total Citations
  View Citations
- 1,699
  Total Downloads
- Downloads (Last 12 months)471
- Downloads (Last 6 weeks)67
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

How good are query optimizers, really?

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Index Terms

Recommendations

Building query optimizers with combinators

View-based query containment

Query Folding

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

How good are query optimizers, really?

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Index Terms

Recommendations

Building query optimizers with combinators

View-based query containment

Query Folding

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media