skip to main content
research-article

How good are query optimizers, really?

Published:01 November 2015Publication History
Skip Abstract Section

Abstract

Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We investigate the quality of industrial-strength cardinality estimators and find that all estimators routinely produce large errors. We further show that while estimates are essential for finding a good join order, query performance is unsatisfactory if the query engine relies too heavily on these estimates. Using another set of experiments that measure the impact of the cost model, we find that it has much less influence on query performance than the cardinality estimates. Finally, we investigate plan enumeration techniques comparing exhaustive dynamic programming with heuristic algorithms and find that exhaustive enumeration improves performance despite the sub-optimal cardinality estimates.

References

  1. R. Ahmed, R. Sen, M. Poess, and S. Chakkappen. Of snowstorms and bushy trees. PVLDB, 7(13):1452--1461, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. B. Babcock and S. Chaudhuri. Towards a robust query optimizer: A principled and practical approach. In SIGMOD, pages 119--130, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. S. Bellamkonda, H.-G. Li, U. Jagtap, Y. Zhu, V. Liang, and T. Cruanes. Adaptive and big data scale parallel execution in Oracle. PVLDB, 6(11):1102--1113, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Borovica-Gajic, S. Idreos, A. Ailamaki, M. Zukowski, and C. Fraser. Smooth scan: Statistics-oblivious access paths. In ICDE, pages 315--326, 2015.Google ScholarGoogle ScholarCross RefCross Ref
  5. N. Bruno, C. A. Galindo-Legaria, and M. Joshi. Polynomial heuristics for query optimization. In ICDE, pages 589--600, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  6. S. Chaudhuri. Query optimizers: time to rethink the contract? In SIGMOD, pages 961--968, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Chaudhuri, V. R. Narasayya, and R. Ramamurthy. Exact cardinality query optimization for optimizer testing. PVLDB, 2(1):994--1005, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Colgan. Oracle adaptive joins. https://blogs.oracle.com/optimizer/entry/what_s_new_in_12c, 2013.Google ScholarGoogle Scholar
  9. A. Dutt and J. R. Haritsa. Plan bouquets: query processing without selectivity estimation. In SIGMOD, pages 1039--1050, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. C. Estan and J. F. Naughton. End-biased samples for join cardinality estimation. In ICDE, page 20, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. L. Fegaras. A new heuristic for optimizing large queries. In DEXA, pages 726--735, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. P. Fender and G. Moerkotte. Counter strike: Generic top-down join enumeration for hypergraphs. PVLDB, 6(14):1822--1833, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. P. Fender, G. Moerkotte, T. Neumann, and V. Leis. Effective and robust pruning for top-down join enumeration algorithms. In ICDE, pages 414--425, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. C. Fraser, L. Giakoumakis, V. Hamine, and K. F. Moore-Smith. Testing cardinality estimation models in SQL Server. In DBtest, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. G. Graefe. A generalized join algorithm. In BTW, pages 267--286, 2011.Google ScholarGoogle Scholar
  16. Z. Gu, M. A. Soliman, and F. M. Waas. Testing the accuracy of query optimizers. In DBTest, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. P. J. Haas, J. F. Naughton, S. Seshadri, and A. N. Swami. Selectivity and cost estimation for joins based on random sampling. J Computer System Science, 52(3):550--569, 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. R. Haritsa. The Picasso database query optimizer visualizer. PVLDB, 3(2):1517--1520, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. I. F. Ilyas, V. Markl, P. J. Haas, P. Brown, and A. Aboulnaga. CORDS: automatic discovery of correlations and soft functional dependencies. In SIGMOD, pages 647--658, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Y. E. Ioannidis. The history of histograms (abridged). In VLDB, pages 19--30, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Y. E. Ioannidis and S. Christodoulakis. On the propagation of errors in the size of join results. In SIGMOD, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. R. A. Kader, P. A. Boncz, S. Manegold, and M. van Keulen. ROX: run-time optimization of XQueries. In SIGMOD, pages 615--626, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. R. Kaushik, C. Ré, and D. Suciu. General database statistics using entropy maximization. In DBPL, pages 84--99, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Q. Li, M. Shao, V. Markl, K. S. Beyer, L. S. Colby, and G. M. Lohman. Adaptively reordering joins during query execution. In ICDE, pages 26--35, 2007.Google ScholarGoogle ScholarCross RefCross Ref
  25. F. Liu and S. Blanas. Forecasting the cost of processing multi-join queries via hashing for main-memory databases. In SoCC, pages 153--166, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. G. Lohman. Is query optimization a solved problem? http://wp.sigmod.org/?p=1075, 2014.Google ScholarGoogle Scholar
  27. L. F. Mackert and G. M. Lohman. R* optimizer validation and performance evaluation for local queries. In SIGMOD, pages 84--95, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. V. Markl, N. Megiddo, M. Kutsch, T. M. Tran, P. J. Haas, and U. Srivastava. Consistently estimating the selectivity of conjuncts of predicates. In VLDB, pages 373--384, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. G. Moerkotte and T. Neumann. Dynamic programming strikes back. In SIGMOD, pages 539--552, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. G. Moerkotte, T. Neumann, and G. Steidl. Preventing bad plans by bounding the impact of cardinality estimation errors. PVLDB, 2(1):982--993, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. I. Müller, P. Sanders, A. Lacurie, W. Lehner, and F. Färber. Cache-efficient aggregation: Hashing is sorting. In SIGMOD, pages 1123--1136, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. T. Neumann. Query simplification: graceful degradation for join-order optimization. In SIGMOD, pages 403--414, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. T. Neumann and C. A. Galindo-Legaria. Taking the edge off cardinality estimation errors using incremental execution. In BTW, pages 73--92, 2013.Google ScholarGoogle Scholar
  34. V. Poosala and Y. E. Ioannidis. Selectivity estimation without the attribute value independence assumption. In VLDB, pages 486--495, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. F. Rusu and A. Dobra. Sketches for size of join estimation. TODS, 33(3), 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In SIGMOD, pages 23--34, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. M. Steinbrunn, G. Moerkotte, and A. Kemper. Heuristic and randomized optimization for the join ordering problem. VLDB J., 6(3):191--208, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. M. Stillger, G. M. Lohman, V. Markl, and M. Kandil. LEO - DB2's learning optimizer. In VLDB, pages 19--28, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. K. Tzoumas, A. Deshpande, and C. S. Jensen. Lightweight graphical models for selectivity estimation without independence assumptions. PVLDB, 4(11):852--863, 2011.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. F. Waas and A. Pellenkoft. Join order selection - good enough is easy. In BNCOD, pages 51--67, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. F. M. Waas, L. Giakoumakis, and S. Zhang. Plan space analysis: an early warning system to detect plan regressions in cost-based optimizers. In DBTest, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. W. Wu, Y. Chi, S. Zhu, J. Tatemura, H. Hacigümüs, and J. F. Naughton. Predicting query execution time: Are optimizer cost models really unusable? In ICDE, pages 1081--1092, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. F. Yu, W. Hou, C. Luo, D. Che, and M. Zhu. CS2: a new database synopsis for query estimation. In SIGMOD, pages 469--480, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. How good are query optimizers, really?
          Index terms have been assigned to the content through auto-classification.

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image Proceedings of the VLDB Endowment
            Proceedings of the VLDB Endowment  Volume 9, Issue 3
            November 2015
            144 pages

            Publisher

            VLDB Endowment

            Publication History

            • Published: 1 November 2015
            Published in pvldb Volume 9, Issue 3

            Qualifiers

            • research-article

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader