ABSTRACT
Most contemporary database systems perform cost-based join enumeration using some variant of System-R's bottom-up dynamic programming method. The notable exceptions are systems based on the top-down transformational search of Volcano/Cascades. As recent work has demonstrated, bottom-up dynamic programming can attain optimality with respect to the shape of the join graph; no comparable results have been published for transformational search. However, transformational systems leverage benefits of top-down search not available to bottom-up methods.
In this paper we describe a top-down join enumeration algorithm that is optimal with respect to the join graph. We present performance results demonstrating that a combination of optimal enumeration with search strategies such as branch-and-bound yields an algorithm significantly faster than those previously described in the literature. Although our algorithm enumerates the search space top-down, it does not rely on transformations and thus retains much of the architecture of traditional dynamic programming. As such, this work provides a migration path for existing bottom-up optimizers to exploit top-down search without drastically changing to the transformational paradigm.
- R. Ahmed, A. Lee, A. Witkowski, D. Das, H. Su, M. Zait, and T. Cruanes. Cost-based query transformation in Oracle. In Proc. 32nd International Conference on Very Large Data Bases, pages 1026--1036, 2006. Google ScholarDigital Library
- A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974. Google ScholarDigital Library
- I. T. Bowman and G. N. Paulley. Join enumeration in a memory-constrainted environment. In Proc. 16th International Conference on Data Engineering, pages 645--654, 2000. Google ScholarDigital Library
- S. Chaudhuri, R. Krishnamurthy, S. Potamianos, and K. Shim. Optimizing queries with materialized views. In Proc. 11th International Conference on Data Engineering, pages 190--200, 1995. Google ScholarDigital Library
- D. DeHaan and F. W. Tompa. Optimal top-down join enumeration (extended version). Technical Report CS-2007-02, School of Computer Science, University of Waterloo, 2007.Google Scholar
- H. Garcia-Molina, J. D. Ullman, and J. Widom. Database Systems: The Complete Book. Prentice Hall PTR, 2001. Google ScholarDigital Library
- G. Graefe. The Cascades framework for query optimization. IEEE Data Engineering Bulletin, 18(3):19--29, 1995.Google Scholar
- G. Graefe. The Microsoft relational engine. In Proc. 12th International Conference on Data Engineering, pages 160--161, 1996. Google ScholarDigital Library
- G. Graefe and W. J. McKenna. The Volcano optimizer generator: Extensibility and efficient search. In Proc. 9th International Conference on Data Engineering, pages 209--218, 1993. Google ScholarDigital Library
- N. Kabra and D. J. DeWitt. OPT++: An object-oriented implementation for extensible database query optimization. VLDB J., 8(1):55--78, 1999. Google ScholarDigital Library
- W. J. McKenna, L. Burger, C. Hoang, and M. Truong. EROC: A toolkit for building NEATO query optimizers. In Proc. 22nd International Conference on Very Large Data Bases, pages 111--121, 1996. Google ScholarDigital Library
- G. Moerkotte and T. Neumann. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees without cross products. In Proc. 32nd International Conference on Very Large Data Bases, pages 930--941, 2006. Google ScholarDigital Library
- E. F. Moore and C. E. Shannon. Reliable circuits using less reliable relays. J. Franklin Institute, 262(3,4):191--208, 281--297, 1956.Google ScholarCross Ref
- S. Morishita. Avoiding cartesian products for multiple joins. J. ACM, 44(1):57--85, 1997. Google ScholarDigital Library
- K. Ono and G. M. Lohman. Measuring the complexity of join enumeration in query optimization. In Proc. 16th International Conference on Very Large Data Bases, pages 314--325, 1990.Google ScholarDigital Library
- A. Pellenkoft, C. A. Galindo-Legaria, and M. L. Kersten. Complexity of transformation-based optimizers and duplicate-free generation of alternatives. Technical Report CS-R9639, CWI, 1996. Google ScholarDigital Library
- A. Pellenkoft, C. A. Galindo-Legaria, and M. L. Kersten. The complexity of transformation-based join enumeration. In Proc. 23rd International Conference on Very Large Data Bases, pages 306--315, 1997. Google ScholarDigital Library
- H. Pirahesh, J. M. Hellerstein, and W. Hasan. Extensible/rule based query rewrite optimization in Starburst. In Proc. ACM SIGMOD International Conference on Management of Data, pages 39--48, 1992. Google ScholarDigital Library
- J. S. Provan and D. R. Shier. A paradigm for listing(s,t)-cuts in graphs. Algorithmica, 15(4):351--372, April 1996.Google ScholarDigital Library
- P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In Proc. ACM SIGMOD International Conference on Management of Data, pages 23--34, 1979. Google ScholarDigital Library
- L. D. Shapiro, D. Maier, P. Benninghoff, K. Billings, Y. Fan, K. Hatwal, Q. Wang, Y. Zhang, H. min Wu, and B. Vance. Exploiting upper and lower bounds in top-down query optimization. In International Database Engineering & Applications Symposium, pages 20--33, 2001. Google ScholarDigital Library
- S. Tsukiyama, I. Shirakawa, H. Ozaki, and H. Ariyoshi. An algorithm to enumerate all cutsets of a graph in linear time per cutset. J. ACM, 27(4):619--632, 1980. Google ScholarDigital Library
- B. Vance and D. Maier. Rapid bushy join-order optimization with cartesian products. In Proc. ACM SIGMOD International Conference on Management of Data, pages 35--46, 1996. Google ScholarDigital Library
Index Terms
- Optimal top-down join enumeration
Recommendations
Join Enumeration in a Memory-Constrained Environment
ICDE '00: Proceedings of the 16th International Conference on Data EngineeringIn today's computing environment, database technology can be found on virtually any device, from traditional mainframes to cellular phones. Sophisticated applications, whether enterprise information portals or sales force automation systems, can `push' ...
SPARQL-to-SQL Query Translation: Bottom-Up or Top-Down?
SCC '11: Proceedings of the 2011 IEEE International Conference on Services ComputingEmerging Semantic Web Services rely on the availability of metadata that describes various functional and non-functional characteristics of computational resources. A number of semantic vocabularies and datasets describing existing services and ...
Efficient Top-k Query Answering through its Top-N Rewritings Using Views
PIKM '15: Proceedings of the 8th Workshop on Ph.D. Workshop in Information and Knowledge ManagementRecently, various algorithms were proposed to speed up top-k query answering by using multiple materialized query results. Nevertheless, for most of the proposed algorithms, a potentially costly view selection operation is required. In fact, the ...
Comments