skip to main content
article
Free Access

Iterative dynamic programming: a new class of query optimization algorithms

Published:01 March 2000Publication History
Skip Abstract Section

Abstract

The query optimizer is one of the most important components of a database system. Most commercial query optimizers today are based on a dynamic-programming algorithm, as proposed in Selinger et al. [1979]. While this algorithm produces good optimization results (i.e, good plans), its high complexity can be prohibitive if complex queries need to be processed, new query execution techniques need to be integrated, or in certain programming environments (e.g., distributed database systems). In this paper, we present and thoroughly evaluate a new class of query optimization algorithms that are based on a principle that we call iterative dynamic programming, or IDP for short. IDP has several important advantages: First, IDP-algorithms produce the best plans of all known algorithms in situations in which dynamic programming is not viable because of its high complexity. Second, some IDP variants are adaptive and produce as good plans as dynamic programming if dynamic programming is viable and as good-as possible plans if dynamic programming turns out to be not viable. Three, all IDP-algorithms can very easily be integrated into an existing optimizer which is based on dynamic programming.

References

  1. BUCK-EMDEN, R. AND GALIMOW, J. 1996. SAP R/3 System, A Client ~Server Technology. Addison-Wesley, Reading, MA. Google ScholarGoogle Scholar
  2. CAREY, M. AND KOSSMANN, D. 1997. On saying "enough already!" in SQL. In Proceedings of the International ACM Conference on Management of Data (SIGMOD '97, Tucson, AZ, May), ACM, New York, NY, 219-230. Google ScholarGoogle Scholar
  3. CHAUDHURI, S. AND SHIM, K. 1994. Including group-by in query optimization. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94, Santiago, Chile, Sept.), VLDB Endowment, Berkeley, CA, 354-366. Google ScholarGoogle Scholar
  4. CHAUDHURI, S. AND SHIM, K. 1996. Optimization of queries with user-defined predicates. In Proceedings of the 22nd International Conference on Very Large Data Bases (VLDB '96, Bombay, India, Sept.), 87-98. Google ScholarGoogle Scholar
  5. DOPPELHAMMER, g., H PPLER, T., KEMPER, n., AND KOSSMANN, D. 1997. Database performance in the real world: TPC-D and SAP R/3. In Proceedings of the ACM SIGMOD Conference on Management of Data (ACM SIGMOD, Tucson, AZ, May), ACM Press, New York, NY, 123-134. Google ScholarGoogle Scholar
  6. FRANKLIN, M., J NSSON, B., AND KOSSMANN, D. 1996. Performance tradeoffs for client-server query processing. In Proceedings of the ACM-SIGMOD Conference on Management of Data (Montreal, Canada, June), ACM, New York, NY, 149-160. Google ScholarGoogle Scholar
  7. GALINDO-LEGARIA, C., PELLENKOFT, n., AND KERSTEN, M. 1994. Fast, randomized join-order selection--why use transformations?. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94, Santiago, Chile, Sept.), VLDB Endowment, Berkeley, CA, 85-95. Google ScholarGoogle Scholar
  8. GANGULY, S., HASAN, W., AND KRISHNAMURTHY, R. 1992. Query optimization for parallel execution. In Proceedings of the ACM-SIGMOD Conference on Management of Data (ACM SIGMOD, San Diego, CA, June), ACM, New York, NY, 9-18. Google ScholarGoogle Scholar
  9. GASSNER, P., LOHMAN, G., SCHIEFER, K., AND WANG, Y. 1993. Query optimization in the IBM DB2 family. IEEE Data Eng. Tech. Bull. 16, 3 (Sept.), 4-18.Google ScholarGoogle Scholar
  10. GRAEFE, G. AND DEWITT, D. 1987. The EXODUS optimizer generator. In Proceedings of the ACM-SIGMOD Conference on Management of Data (San Francisco, CA, May), ACM, New York, NY, 160-172. Google ScholarGoogle Scholar
  11. GRAEFE, G. AND MCKENNA, W.J. 1993. The volcano optimizer generator: Extensibility and efficient search. In Proceedings of the 9th International Conference on Data Engineering (Vienna, Austria, Apr.), IEEE Computer Society, Washington, DC, 209-218. Google ScholarGoogle Scholar
  12. HAAS, L., KOSSMANN, D., WIMMERS, E., AND YANG, J. 1997. Optimizing queries across diverse data sources. In Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB '97, Athens, Greece, Aug.), 276-285. Google ScholarGoogle Scholar
  13. HELLERSTEIN, J. M. 1994. Practical predicate placement. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data (SIGMOD '94, Minneapolis, MN, May 24-27), R. T. Snodgrass and M. Winslett, Eds. ACM Press, New York, NY, 325-335. Google ScholarGoogle Scholar
  14. HELLERSTEIN, J. M. AND STONEBRAKER, M. 1993. Predicate migration: Optimizing queries with expensive predicates. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (SIGMOD '93, Washington, DC, May 26-28), P. Buneman and S. Jajodia, Eds. ACM Press, New York, NY, 267-276. Google ScholarGoogle Scholar
  15. HONG, W. AND STONEBRAKER, M. 1991. Optimization of parallel execution plans in xprs. In Proceedings of the First International Conference on Parallel and Distributed Information Systems (Miami Beach, FL), 218-225. Google ScholarGoogle Scholar
  16. IBARAKI, T. AND KAMEDA, T. 1984. On the optimal nesting order for computing N-relational joins. ACM Trans. Database Syst. 9, 3 (Sept. 1984), 482-502. Google ScholarGoogle Scholar
  17. IOANNIDIS, Y. E. AND KANG, Y. C. 1990. Randomized algorithms for optimizing large join queries. In Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data (SIGMOD '90, Atlantic City, NJ, May 23-25), H. Garcia-Molina, Ed. ACM Press, New York, NY, 312-321. Google ScholarGoogle Scholar
  18. IOANNIDIS, Y. AND WONG, E. 1987. Query optimization by simulated annealing. In Proceedings of the ACM-SIGMOD Conference on Management of Data (San Francisco, CA, May), ACM, New York, NY, 9-22. Google ScholarGoogle Scholar
  19. KEMPER, A., KOSSMANN, D., AND MATTHES, F. 1998. SAP R/3 (tutorial): a database application system. SIGMOD Rec. 27, 2, 499. http://www.db.fmi.uni-passau.de/publications/tutorials Google ScholarGoogle Scholar
  20. KEMPER, A., MOERKOTTE, G., AND PEITHNER, K. 1993. A blackboard architecture for query optimization in object bases. In Proceedings of the Conference on Very Large Data Bases (VLDB '93, Dublin, Ireland, Aug.), Morgan Kaufmann Publishers Inc., San Francisco, CA, 543-554. Google ScholarGoogle Scholar
  21. KOSSMANN, D. 1998. The state of the art in distributed query processing. Submitted for publication.Google ScholarGoogle Scholar
  22. KRISHNAMURTHY, R., BORAL, H., AND ZANIOLO, C. 1986. Optimization of nonrecursive queries. In Proceedings of the 12th International Conference on Very Large Data Bases (Kyoto, Japan, Aug.), VLDB Endowment, Berkeley, CA, 128-137. Google ScholarGoogle Scholar
  23. LANZELOTTE, R., VALDURIEZ, P., AND ZAIT, M. 1993. On the effectiveness of optimization search strategies for parallel execution spaces. In Proceedings of the Conference on Very Large Data Bases (VLDB '93, Dublin, Ireland, Aug.), Morgan Kaufmann Publishers Inc., San Francisco, CA, 493-504. Google ScholarGoogle Scholar
  24. LI, C., YERNENI, R., VASSALOS, V., GARCIA-MOLINA, H., PAPAKONSTANTINOU, Y., ULLMAN, J. D., AND VALIVETI, M. 1998. Capability based mediation in TSIMMIS (demonstration). In Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD, Seattle, WA, June), ACM Press, New York, NY, 564-566. Google ScholarGoogle Scholar
  25. LOHMAN, G. 1988. Grammar-like functional rules for representing query optimization alternatives. In Proceedings of the ACM SIGMOD Conference on Management of Data (SIGMOD '88, Chicago, IL, June 1-3), H. Boral and P.-A. Larson, Eds. ACM Press, New York, NY, 18-27. Google ScholarGoogle Scholar
  26. MACKERT, L. F. AND LOHMAN, G. M. 1986. R* optimizer validation and performance evaluation for distributed queries. In Proceedings of the 12th International Conference on Very Large Data Bases (Kyoto, Japan, Aug.), VLDB Endowment, Berkeley, CA, 149-159. Google ScholarGoogle Scholar
  27. ONO, K. AND LOHMAN, G. 1990. Measuring the complexity of join enumeration in query optimization. In Proceedings of the 16th International Conference on Very Large Data Bases (VLDB, Brisbane, Australia, Aug.), VLDB Endowment, Berkeley, CA, 314-325. Google ScholarGoogle Scholar
  28. PALERMO, F. P. 1974. A data base search problem. In Information Systems COINS IV, J. T. Tou, Ed. Plenum Press, New York, NY, 67-101.Google ScholarGoogle Scholar
  29. PELLENKOFT, A., GALINDO-LEGARIA,, C., AND KERSTEN, M. 1997. The complexity of transformation-based join enumeration. In Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB '97, Athens, Greece, Aug.), 306-315. Google ScholarGoogle Scholar
  30. SCHEUFELE, W. AND MOERKOTTE, G. 1997. On the complexity of generating optimal plans with cross products (extended abstract). In Proceedings of the 16th ACM SIGACT-SIGMOD- SIGART Symposium on Principles of Database Systems (PODS '97, Tucson, AZ, May 12-14, 1997), A. Mendelzon and Z.M. zsoyoglu, Eds. ACM Press, New York, NY, 238-248. Google ScholarGoogle Scholar
  31. SELINGER, P. G., ASTRAHAN, M. M., LORIE, R. A., AND PRICE, T. G. 1979. Access path selection in a relational database management system. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '79, Boston, MA, May 30-June 1), ACM Press, New York, NY, 23-34. Google ScholarGoogle Scholar
  32. SHEKITA, E. AND YOUNG, H. 1997. Iterative dynamic programming. U.S. patent 5671403.Google ScholarGoogle Scholar
  33. SHEKITA, E., YOUNG, H., AND TAN, K. -L. 1993. Multi-join optimization for symmetric multiprocessors. In Proceedings of the Conference on Very Large Data Bases (VLDB '93, Dublin, Ireland, Aug.), Morgan Kaufmann Publishers Inc., San Francisco, CA, 479-492. Google ScholarGoogle Scholar
  34. SIMMEN, D., SHEKITA, E., AND MALKEMUS, T. 1996. Fundamental techniques for order optimization. In Proceedings of the ACM-SIGMOD Conference on Management of Data (Montreal, Canada, June), ACM, New York, NY, 57-67. Google ScholarGoogle Scholar
  35. STEINBRUNN, M., MOERKOTTE, G., AND KEMPER, A. 1997. Heuristic and randomized optimization for the join ordering problem. VLDB J. 6, 3 (Aug.), 191-208. Google ScholarGoogle Scholar
  36. STONEBRAKER, M., AOKI, P., LITWIN, W., PFEFFER, A., SAH, A., SIDELL, J., STAELIN, C., AND YU, A. 1996. Mariposa: A wide-area distributed database system. VLDB J. 5, 1 (Jan.), 48-63. Google ScholarGoogle Scholar
  37. SWAMI, A. 1989. Optimization of large join queries: Combining heuristics and combinational techniques. In Proceedings of the ACM Conference on Management of Data (SIGMOD '89, Portland, OR, May), ACM Press, New York, NY, 367-376. Google ScholarGoogle Scholar
  38. SWAMI, A. 1991. Distributions of query plan costs for large join queries. Tech. Rep. RJ 72891. IBM Almaden Research Center.Google ScholarGoogle Scholar
  39. SWANI, A. AND GUPTA, A. 1988. Optimization of large join queries. In Proceedings of the ACM SIGMOD Conference on Management ofData (SIGMOD '88, Chicago, IL, June 1-3), H. Boral and P.-A. Larson, Eds. ACM Press, New York, NY, 8-17. Google ScholarGoogle Scholar
  40. SWAMI, A. AND IYER, B. 1993. A polynomial time algorithm for optimizing join queries. In Proceedings of the 9th International Conference on Data Engineering (Vienna, Austria, Apr.), IEEE Computer Society, Washington, DC, 345-354. Google ScholarGoogle Scholar
  41. VANCE, B. 1998. Join-order optimization with Cartesian products. Ph.D. Dissertation. Oregon Graduate Institute of Science & Technology, Beaverton, OR. Google ScholarGoogle Scholar
  42. VANCE, B. AND MAIER, D. 1996. Rapid bushy join-order optimization with Cartesian product. In Proceedings of the ACM-SIGMOD Conference on Management of Data (Montreal, Canada, June), ACM, New York, NY, 35-46. Google ScholarGoogle Scholar

Index Terms

  1. Iterative dynamic programming: a new class of query optimization algorithms

          Recommendations

          Reviews

          Alan Raymond Hevner

          The authors propose an intriguing approach for query optimization on distributed database systems, iterative dynamic programming (IDP). This heuristic approach consists of a series of dynamic programming stages in the process of query optimization. The classic dynamic programming algorithm for the optimization of a query with n joins is to generate all two-way join plans, then all three-way join plans, continuing until all n -way join plans are fully enumerated. The optimal query-processing plan is selected for execution. However, the full dynamic programming approach has exponential time and space complexity, making it infeasible for optimizing all but small queries. The essence of the new heuristic is that instead of fully enumerating all query processing plans, a resource limit is established (defined by a parameter k ). During each dynamic programming stage, all query processing plans are enumerated up to k -way joins. At that point, one or more of the best plans are chosen to initiate the next stage of dynamic programming, which also produces query-processing plans until the resource limit is reached. The sequence of dynamic programming stages continues until a complete plan is generated. The authors identify a number of important variations of the IDP approach: The system administrator can initially set the resource limit, or the IDP algorithm can automatically evaluate when a resource limit is being approached and can terminate that dynamic programming stage. Based on expert rules, more than one query-processing plan can be retained at the completion of each dynamic programming stage. This allows a wider range of options to be explored during the next optimization stage. The selection of good plans can be biased to retain balanced query trees over skewed query trees. This is found to result in improved query-processing plans in future stages. The criteria for goodness applied in the selection of the best query processing plans can be altered to match a wide range of optimization goals. An outstanding aspect of this paper is the thorough experimental evaluations of the IDP algorithms. The authors clearly demonstrate the time and space advantages of the IDP approach over classic dynamic programming and over other distributed query optimization heuristics proposed in the research literature. The query processing plans generated by the IDP algorithms are shown to compare favorably with the other approaches. Comparisons of the IDP variants provide important insights into the design tradeoffs of the query optimization algorithms. I recommend this well-written paper to researchers in distributed query optimization and to system developers who are looking for a new approach to optimizing complex queries on large distributed database systems.

          Access critical reviews of Computing literature here

          Become a reviewer for Computing Reviews.

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader