ABSTRACT
The traditional focus of relational query optimization schemes has been on the choice of join methods and join orders. Restrictions have typically been handled in query optimizers by “predicate pushdown” rules, which apply restrictions in some random order before as many joins as possible. These rules work under the assumption that restriction is essentially a zero-time operation. However, today's extensible and object-oriented database systems allow users to define time-consuming functions, which may be used in a query's restriction and join predicates. Furthermore, SQL has long supported subquery predicates, which may be arbitrarily time-consuming to check. Thus restrictions should not be considered zero-time operations, and the model of query optimization must be enhanced.
In this paper we develop a theory for moving expensive predicates in a query plan so that the total cost of the plan — including the costs of both joins and restrictions — is minimal. We present an algorithm to implement the theory, as well as results of our implementation in POSTGRES. Our experience with the newly enhanced POSTGRES query optimizer demonstrates that correctly optimizing queries with expensive predicates often produces plans that are orders of magnitude faster than plans generated by a traditional query optimizer. The additional complexity of considering expensive predicates during optimization is found to be manageably small.
- CGK89.Danette Chimenti, Ruben Gamboa, and Ravi Krishnamurthy. Towards an Open Architecture for LDL. In Proc. 15th International Conference on Very Large Data Bases, Amsterdam, August 1989. Google ScholarDigital Library
- D+90.O, Deux et al. The Story of 02. IEEE Transactions on Knowledge and Data Engineering, 2(1), March 1990. Google ScholarDigital Library
- Day87.Umeshwar Dayal. Of Nests and Trees: A Unified Approach to #sing Queries that Contain Nested Subqueries, Aggregates, and Quantifiers. In Proc. VLDB 87 {Pro87}, pages 197-208. Google ScholarDigital Library
- Han77.Michael Z. Hanani. An Optimal Evaluation of Boolean Expressions in an Online Query System. Communications of the ACM, 20(5):344-347, may 1977. Google ScholarDigital Library
- HCL+90.L.M. Haas, W. Chang, G.M. Lohman, J. McPherson, P.F. Wilms, G. Lapis, B. Lindsay, H. Pirahesh, M. Carey, and E. Shekita. Starburst Mid-Flight: As the Dust Clears. IEEE Transactions on Knowledge and Data Engineering, pages 143-160, March 1990. Google ScholarDigital Library
- Hel92.Joseph M. Hellerstein. Predicate Migration: Optimizing Queries With Expensive Predicates. Technical Retxm Sequoia 2000 92/13, University of California, Berkeley, December 1992. Google ScholarDigital Library
- HOT88.Wen-Chi Hou, Gultekin Ozsoyoglu, and Baldeao K. Taneja. Statistical Estimators for Relational Algebra Expressions. In Proc. 7th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 276-287, Austin, March 1988. Google ScholarDigital Library
- IK84.Toshihide Ibaraki and Tiko Kameda. Optimal Nesting for Computing N-relational Joins. ACM Transaction; on Database Systems, 9(3):482-502, October 1984. Google ScholarDigital Library
- ISO91.ISO.ANSI. Database Language SQL ISO/IEC 9075:1992, 1991.Google Scholar
- Jhi88.Anant Jhingran. A Performance Study of Query Optimization Algorithms on a Database System Sung Procedures. In Pro~. VLDB 88 {Pro88}. Google ScholarDigital Library
- KBZ86.Ravi Krishnamunhy, Haran Boral, and Carlo Zaniolo. Optimization of Nonrecursive Queries. In Proc. 12th International Conference on Very Large Data Bases, pages 128-137, Kyoto, August 1986. Google ScholarDigital Library
- LDH+87.Guy M. Lohman, Dean Daniels, Laura M. Haas, Ruth Kistler, and Patricia G. Selinger. Optimization of Nested Queries in a Distributed Relational Database. in Proc. VLDB 87 ff#87. Google ScholarDigital Library
- LNSS93.Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider, and S. Seshadri. Efficient Sampling Strategies for Relational Database Operations. To appear in Theoretical Computer Science, 1993. Google ScholarDigital Library
- LS88.C. Lynch and M. Stonebraker.Extended User-Defined Indexing with Application to Textual Databases. In Proc. VLDB 88 tPro881. Google ScholarDigital Library
- Mos90.Claire Mosher (ed.). The POSTGRES Reference Manual, Volume 2. Technical Rqx# M90/53, Electronics Research Laboratory, University of California, Berkeley, July 1990.Google Scholar
- MS79.C. L. Monma and j.B. Sidney. Sequencing with Series- Parallel Precedence Constraints. Mathematics of Operations Research, 4:215-224, 1979.Google ScholarDigital Library
- MS86.D. Maier and J. Stein. Indexing in an Object-Oriented DBMS. In Klaus R. Dittrich and Umeshwar Dayal, editors, Proc. Workshop on Object-Oriented Database Systems, Asilomar, September 1986. Google ScholarDigital Library
- MS87.D. Maier and J. Stein. Development and Implementation of an Object-Oriented DBMS. In Bruce Shrivcr and Peter Wegner, editors, Research Directions in Object-Oriented Programming. MIT Press, 1987. Google ScholarDigital Library
- Ols92.MichaelA. Olson. Extending the POSTGRES Database System to Manage Tertiary Storage. Master's thesis, University of California, Berkeley, May 1992.Google Scholar
- O'N89.E O'Neil. Revisiting DBMS Benchmarks. Datamation, pages 47-54, September 15, 1989.Google Scholar
- ONT92.ONTOS, Inc. ONTOS Object SQL Guide, February 1992. For the ONTOS DB database, Release 2.2.Google Scholar
- PHH92.Hamid Pirahesh, Joseph M. Hellerstein, and Waqar Hasan. Extensible#ule-Basexl Query Rewrite Optimization in Staxburst, in Proc. ACM-SIGMOD International Conference on Management of Data, pages 39-48, San Diego, June 1992, Google ScholarDigital Library
- Pro87.Proc. 13th International Conference on Very Large Data Bases, Brighton, September 1987.Google Scholar
- Pro88.Proc. 14th International Conference on Very Large Data Bases, Los Angeles, August-September 1988.Google Scholar
- RS87.L.A. Rowe and M.R. Stonebraker. The POSTGRES Data Model. in Proc. VLDB 87 {F#87}, pages 83-96. Google ScholarDigital Library
- SAC+79.Patricia G. Selinger, M. Astrahan, D. Chamberlin, Raymond Lorie, and T. Price. Access Path Selection in a Relational Database Management System. In Proc. ACM- SIGMOD International Conference on Management of Data, Boston, June 1979. Google ScholarDigital Library
- SD92.Michael Stonebraker and Jeff Dozier. Sequoia 2000: Large Capacity Object Servers to Support Global Change Research. Technical Report Sequoia 2000 91/1, University of California, Berkeley, March 1992. Google ScholarDigital Library
- SFGM93.Michael Stonebraker, James Frew, Kenn Cmrdels, and Jeff Meredith. The Sequoia 2000 Storage Benchmark. In Proc. ACM-SIGMOD International Conference on Management of Data, Washington, D.C., May 1993. Google ScholarDigital Library
- SI92.Arun Swami and Balakrishna R. Iyer. A Polynomial Time Algorithm for Optimizing Join Queries. Research Report RJ 8812, IBM Almaden Research Center, June 1992.Google Scholar
- Smi56.W.E. Smith. Various Optimizers For Single-Stage Production. Naval Res. Logist. Quart., 3:59--66,1956.Google Scholar
- SR86.M#R. Stonebraker and L.A. Rowe. The Design of POST- GRES. In Proc. ACM-SIGMOD International Conference on Management of Data, Washington, D.C., May 1986. Google ScholarDigital Library
- Sto91.Michael Stonebraker. Managing Persistent Objects in a Multi-Level Store. In Proc. ACM-SIGMOD International Conference on Management of Data, pages 2-11, Denver, June 1991. Google ScholarDigital Library
- TOB89.C. Turbyfill, C. Orji, and Dina Bitton. AS3AP - A Comparative Relational Database Benchmark. In Proc. IEEE Compcon Spring '89, February 1989.Google Scholar
- Ull89.Jeffrey D. UUman. Principles of Database and Knowledge- Base Systems, volume 2. Computer Science Press, 1989.Google Scholar
- WLH90.K. Wilkinson, P. Lyngbaek, and W. Hasan. The Iris Architecture and Implementation. IEEE Transactions on Knowledge and Data Engineering, 2(1), March 1990. Google ScholarDigital Library
Index Terms
- Predicate migration: optimizing queries with expensive predicates
Recommendations
Predicate migration: optimizing queries with expensive predicates
The traditional focus of relational query optimization schemes has been on the choice of join methods and join orders. Restrictions have typically been handled in query optimizers by “predicate pushdown” rules, which apply restrictions in some random ...
Exploiting predicate-window semantics over data streams
The continuous sliding-window query model is used widely in data stream management systems where the focus of a continuous query is limited to a set of the most recent tuples. In this paper, we show that an interesting and important class of queries ...
Comments