ABSTRACT
Modern query optimizers need to take into account the performance of expensive user-defined predicates. Existing research has shown how to incorporate such predicates in a traditional cost-based query optimizer. In this paper we deal with the optimization of the expensive predicates themselves, showing how their cost can be reduced by using cheaper, but less accurate, versions of the predicates to pre-filter tuples. We discuss the generalized tuple handling mechanism, which processes tuples along a fixed sequence of versions, as well as adaptive approaches that either split tuple streams into groups or make routing decisions at the level of individual tuples. We identify the lower bound on the cost of evaluating a multi-version selection predicate, given by an ideal individualized plan (IIP), and develop an optimal generalized plan (OGP). We then show that realistic individualized or grouped schemes can achieve a cost between those of OGP and IIP when tuples deviate substantially from the average stream behavior. Our algorithms are tested experimentally, identifying many of the issues that arise whenever multi-version predicates are used.
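The idea of pre-filtering along a fixed sequence of versions can be illustrated with a minimal sketch. It is not the paper's algorithm: the `Version` class, the `run_cascade` function, the cost figures, and the divisibility predicates are all hypothetical. It also assumes that every cheap version is conservative, meaning that it never rejects a tuple the exact predicate would accept.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Version:
    """One version of a predicate (hypothetical structure for illustration)."""
    name: str
    cost: float                    # assumed per-tuple evaluation cost
    test: Callable[[int], bool]    # False means the tuple is rejected


def run_cascade(tuples: Iterable[int],
                versions: List[Version]) -> Tuple[List[int], float]:
    """Send every tuple through the versions in a fixed order.

    A tuple stops at the first version that rejects it, so later
    (more expensive) versions are only paid for by survivors.
    """
    survivors: List[int] = []
    total_cost = 0.0
    for t in tuples:
        passed = True
        for v in versions:
            total_cost += v.cost
            if not v.test(t):
                passed = False
                break
        if passed:
            survivors.append(t)
    return survivors, total_cost


# Assumed toy predicates: the exact test is "divisible by 15"; the cheap
# version "divisible by 3" is conservative because 15 is a multiple of 3.
cheap = Version("cheap", cost=1.0, test=lambda t: t % 3 == 0)
exact = Version("exact", cost=10.0, test=lambda t: t % 15 == 0)

data = range(100)
cascade_result, cascade_cost = run_cascade(data, [cheap, exact])
direct_result, direct_cost = run_cascade(data, [exact])
```

Under these assumed costs, both plans return the same tuples, but the cascade pays the exact predicate only for tuples that survive the cheap version. Whether a cheap version is worth adding depends on its cost and on how many tuples it eliminates, and that trade-off is what a plan over a fixed sequence of versions has to weigh.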
Index Terms
- Optimization of multi-version expensive predicates