skip to main content
10.1145/1007568.1007642acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Robust query processing through progressive optimization

Published:13 June 2004Publication History

ABSTRACT

Virtually every commercial query optimizer chooses the best plan for a query using a cost model that relies heavily on accurate cardinality estimation. Cardinality estimation errors can occur due to the use of inaccurate statistics, invalid assumptions about attribute independence, parameter markers, and so on. Cardinality estimation errors may cause the optimizer to choose a sub-optimal plan. We present an approach to query processing that is extremely robust because it is able to detect and recover from cardinality estimation errors. We call this approach "progressive query optimization" (POP). POP validates cardinality estimates against actual values as measured during query execution. If there is significant disagreement between estimated and actual values, execution might be stopped and re-optimization might occur. Oscillation between optimization and execution steps can occur any number of times. A re-optimization step can exploit both the actual cardinality and partial results, computed during a previous execution step. Checkpoint operators (CHECK) validate the optimizer's cardinality estimates against actual cardinalities. Each CHECK has a condition that indicates the cardinality bounds within which a plan is valid. We compute this validity range through a novel sensitivity analysis of query plan operators. If the CHECK condition is violated, CHECK triggers re-optimization. POP has been prototyped in a leading commercial DBMS. An experimental evaluation of POP using TPC-H queries illustrates the robustness POP adds to query processing, while incurring only negligible overhead. A case-study applying POP to a real-world database and workload shows the potential of POP, accelerating complex OLAP queries by almost two orders of magnitude.

References

  1. AH00 R. Avnur and J. M. Hellerstein, Eddies: Continuously Adaptive Query Optimization, SIGMOD 2000 Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. ARM89 R. Ahad, K. V. B. Rao, and D. McLeod, On Estimating the Cardinality of the Projection of a Database Relation, TODS 14(1), 1989 Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. BC02 N. Bruno and S. Chaudhuri. Exploiting Statistics on Query Expressions for Optimization, SIGMOD 2002 Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. C+81 D. D. Chamberlin et al., Support for Repetitive Transactions and Ad-Hoc Query in System R, TODS 6(1), 1981. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. CG94 R. Cole and G. Graefe. Optimization of Dynamic query evaluation plans, SIGMOD 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Ge193 A. Van Gelder, Multiple Join Size Estimation by Virtual Domains, PODS 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. GM93 G. Graefe and W. McKenna, The Volcano Optimizer Generator: Extensibility and Efficient Search. ICDE, 1993 Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. GW89 G. Graefe and K. Ward, Dynamic Query Evaluation Plans. SIGMOD 1989 Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. HS93 P. Haas and A. Swami, Sampling-Based Selectivity Estimation for Joins Using Augmented Frequent Value Statistics, IBM Research Report, 1993Google ScholarGoogle Scholar
  10. HS02 A. Hulgeri and S. Sudarshan. Parametric Query Optimization for Linear and Piecewise Linear Cost Functions. VLDB 2002.Google ScholarGoogle Scholar
  11. IC91 Y.E. Ioannidis and S. Christodoulakis. Propagation of Errors in the Size of Join Results, SIGMOD 1991 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Ives 02 Z. Ives, Efficient Query Processing for Data Integration, Ph.D thesis, University of Washington, 2002 Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. KD98 N. Kabra and D. DeWitt, Efficient Mid-Query Re-Optimization of Sub-Optimal Query Execution Plans, SIGMOD 1998 Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. MS02 S. Madden, M. Shah, J. M. Hellerstein, V. Raman. Continuously Adaptive Continuous Queries. SIGMOD 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. PIH+96 V. Poosala, et. al, Improved histograms for selectivity estimation of range predicates, SIGMOD 1996 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. P197 V. Poosala and Y. Ioannidis, Selectivity Estimation without value independence, VLDB 1997 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. RAH03 V. Raman, A. Deshpande, and J. M. Hellerstein, Using State Modules for Adaptive Query Optimization. ICDE 2003Google ScholarGoogle Scholar
  18. SAC+79 P.G. Selinger et al. Access Path Selection in a Relational DBMS. SIGMOD 1979 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. SS94 A. N. Swami and K. B. Schiefer, On the Estimation of Join Result Sizes, EDBT 1994 Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. SWK96 M. Stonebraker, E. Wong and P. Kreps. The Design and Implementation of INGRES. TODS 1(3), 1976. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. UFA98 T. Urhan, M. J. Franklin, and L. Amsaleg, Cost-based Query Scrambling for Initial Delays, SIGMOD 1998 Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. SLM+01 M. Stillger, G. Lohman, V. Markl, and M. Kandil. LEO --- DB2's Learning Optimizer, VLDB 2001 Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. ZCL+00 M. Zaharioudakis et. al: Answering Complex SQL Queries Using ASTs. SIGMOD 2000 Google ScholarGoogle ScholarDigital LibraryDigital Library
  1. Robust query processing through progressive optimization

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data
        June 2004
        988 pages
        ISBN:1581138598
        DOI:10.1145/1007568

        Copyright © 2004 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 13 June 2004

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • Article

        Acceptance Rates

        Overall Acceptance Rate785of4,003submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader