ABSTRACT
Performance of query processing functions in a DBMS can be affected by many factors, including the hardware platform, data distributions, predicate parameters, compilation method, algorithmic variations and the interactions between these. Given that there are often different function implementations possible, there is a latent performance diversity which represents both a threat to performance robustness if ignored (as is usual now) and an opportunity to increase the performance if one would be able to use the best performing implementation in each situation. Micro Adaptivity, proposed here, is a framework that keeps many alternative function implementations (flavors) in a system. It uses a learning algorithm to choose the most promising flavor potentially at each function call, guided by the actual costs observed so far. We argue that Micro Adaptivity both increases performance robustness, and saves development time spent in finding and tuning heuristics and cost model thresholds in query optimization. In this paper, we (i) characterize a number of factors that cause performance diversity between primitive flavors, (ii) describe an e-greedy learning algorithm that casts the flavor selection into a multi-armed bandit problem, and (iii) describe the software framework for Micro Adaptivity that we implemented in the Vectorwise system. We provide micro-benchmarks, and an overall evaluation on TPC-H, showing consistent improvements.
- C. W. Antoine, A. Petitet, and J. J. Dongarra. Automated empirical optimization of software and the atlas project. Parallel Computing, 27:2001, 2000.Google Scholar
- P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3):235--256, May 2002. Google ScholarDigital Library
- P. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-Pipelining Query Execution. In CIDR, 2005.Google Scholar
- S. Chen, A. Ailamaki, P. B. Gibbons, and T. C. Mowry. Improving hash join performance through prefetching. TODS, 32(3), Aug. 2007. Google ScholarDigital Library
- Datta, et. al. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In ACM/IEEE Conf. on Supercomputing, 2008. Google ScholarDigital Library
- A. Deshpande, Z. Ives, and V. Raman. Adaptive query processing. Foundations and Trends in Databases, 1(1):1--140, Jan. 2007. Google ScholarDigital Library
- M. Frigo and S. G. Johnson. FFTW: An adaptive software architecture for the FFT. In Conf. on Acoustics, Speech and Signal Processing, 1998.Google ScholarCross Ref
- T. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Math., 6:4--22, 1985. Google ScholarDigital Library
- X. Li, M. J. Garzaran, and D. Padua. A dynamically tuned sorting library, 2004.Google Scholar
- S. Madden, M. Shah, J. Hellerstein, and V. Raman. Continuously adaptive continuous queries over streams. In SIGMOD, 2002. Google ScholarDigital Library
- S. Padmanabhan, T. Malkemus, R. Agarwal, and A. Jhingran. Block oriented processing of relational database operations in modern computer architectures. In ICDE, 2001. Google ScholarDigital Library
- H. Robbins. Some aspects of the sequential design of experiments. Bulletin of the AMS, 58:527--535, 1952.Google ScholarCross Ref
- K. A. Ross. Selection conditions in main memory. TODS, 29(1):132--161, Mar. 2004. Google ScholarDigital Library
- J. Sompolski, M. Zukowski, and P. Boncz. Vectorization vs. compilation in query execution. In DAMON, 2011. Google ScholarDigital Library
- J. Vermorel and M. Mohri. Multi-armed bandit algorithms and empirical evaluation. In In European Conference on Machine Learning, pages 437--448. Springer, 2005. Google ScholarDigital Library
- C. Watkins. Learning from Delayed Rewards. PhD thesis, University of Cambridge,England, 1989.Google Scholar
- M. Zukowski. Balancing Vectorized Query Execution With Bandwidth-Optimized Storage. PhD thesis, Universiteit van Amsterdam, September 2009.Google Scholar
- M. Zukowski and P. Boncz. Vectorwise: Beyond column stores. IEEE Data Eng. Bull., 35(1):21--27, 2012.Google Scholar
Index Terms
- Micro adaptivity in Vectorwise
Recommendations
Combining Joint and Semi-Join Operations for Distributed Query Processing
The application of a combination of join and semi-join operations to minimize the amount of data transmission required for distributed query processing is discussed. Specifically, two important concepts that occur with the use of join operations as ...
Processing Aggregate Queries with Materialized Views in Data Warehouse Environment
Materialized views, which are derived from base relations and stored in the database, offer opportunities for significant performance gain in query evaluation by providing quick access to the pre-computed data. A materialized view can be utilized in ...
Query processing over object views of relational data
This paper presents an approach to object view management for relational databases. Such a view mechanism makes it possible for users to transparently work with data in a relational database as if it was stored in an object-oriented (OO) database. A ...
Comments