ABSTRACT
Today's query processing engines do not take advantage of the multiple occurrences of a relation in a query to improve performance. Instead, each instance is treated as a distinct relation and has its own independent table access method. In this paper, we present MAPLE, a Multi-instance-Aware PLan Evaluation engine that enables multiple instances of a relation to share one physical scan (called SharedScan) with limited buffer space. During execution, as SharedScan pulls a tuple for any instance, that tuple is also pushed to the buffers of other instances with matching predicates. To avoid buffer overflow, a novel interleaved execution strategy is proposed: whenever an instance's buffer becomes full, the execution is temporarily switched to a drainer (an ancestor blocking operator of the instance) to consume all the tuples in the buffer. Thus, the execution is interleaved between normal processing and drainers. We also propose a cost-based approach to generate a plan to maximize the shared scan benefit as well as to avoid interleaved execution deadlocks. MAPLE is light-weight and can be easily integrated into existing RDBMS executors. We have implemented MAPLE in PostgreSQL, and our experimental study on the TPC-DS benchmark shows significant reduction in execution time.
- Microsoft SQL Server Library. http://msdn2.microsoft.com/en-us/library/bb545450.aspx.Google Scholar
- PostgreSQL. http://www.postgresql.org/.Google Scholar
- TPC BENCHMARK Decision Support. http://www.tpc.org/tpcds/.Google Scholar
- R. Bhashyam. TPC-D: the challenges, issues and results. SIGMOD Rec., 25(4):89--93, 1996. Google ScholarDigital Library
- E. I. Chong, S. Das, G. Eadon, and J. Srinivasan. An efficient sql-based rdf querying scheme. In VLDB, pages 1216--1227, 2005. Google ScholarDigital Library
- L. S. Colby, R. L. Cole, E. Haslam, N. Jazayeri, G. Johnson, W. J. McKenna, L. Schumacher, and D. Wilhite. Redbrick vista: Aggregate computation and management. In ICDE, pages 174--177, 1998. Google ScholarDigital Library
- N. Dalvi, S. Sanghai, P. Roy, and S. Sudarshan. Pipelining in multi-query optimization. Journal of Computer and System Sciences, 66(4):728--762, 2003. Google ScholarDigital Library
- G. Graefe. Query evaluation techniques for large databases. ACM Computing Surveys, 25(2):73--170, 1993. Google ScholarDigital Library
- T. Grust, M. V. Keulen, and J. Teubner. Accelerating xpath evaluation in any rdbms. ACM Trans. Database Syst., 29(1):91--131, 2004. Google ScholarDigital Library
- S. Harizopoulos, V. Shkapenyuk, and A. Ailamaki. Qpipe: a simultaneously pipelined relational query engine. In SIGMOD, pages 383--394, 2005. Google ScholarDigital Library
- C. A. Lang, B. Bhattacharjee, T. Malkemus, S. Padmanabhan, and K. Wong. Increasing buffer-locality for multiple relational table scans through grouping and throttling. In ICDE, pages 1136--1145, 2007.Google ScholarDigital Library
- C. A. Lang, B. Bhattacharjee, T. Malkemus, and K. Wong. Increasing buffer-locality for multiple index based scans through intelligent placement and index scan speed control. In VLDB, pages 1298--1309, 2007. Google ScholarDigital Library
- E. J. O'Neil, P. E. O'Neil, and G. Weikum. The lru-k page replacement algorithm for database disk buffering. In SIGMOD, pages 297--306, 1993. Google ScholarDigital Library
- P. Roy, S. Seshadri, S. Sudarshan, and S. Bhobe. Efficient and extensible algorithms for multi query optimization. In SIGMOD, pages 249--260, 2000. Google ScholarDigital Library
- T. Urhan, M. J. Franklin, and L. Amsaleg. Cost-based query scrambling for initial delays. SIGMOD Rec., 27(2):130--141, 1998. Google ScholarDigital Library
- K. Wilkinson, C. Sayers, H. A. Kuno, and D. Reynolds. Efficient rdf storage and retrieval in jena2. In SWDB, pages 131--150, 2003.Google Scholar
- Y. Zhao, P. M. Deshpande, J. F. Naughton, and A. Shukla. Simultaneous optimization and evaluation of multiple dimensional queries. In SIGMOD, pages 271--282, 1998. Google ScholarDigital Library
- J. Zhou, P.-A. Larson, J.-C. Freytag, and W. Lehner. Efficient exploitation of similar subexpressions for query processing. In SIGMOD, pages 533--544, 2007. Google ScholarDigital Library
- M. Zukowski, S. Heman, N. Nes, and P. Boncz. Cooperative scans: dynamic bandwidth sharing in a dbms. In VLDB, pages 723--734, 2007. Google ScholarDigital Library
Index Terms
- Optimizing complex queries with multiple relation instances
Recommendations
Efficient processing of monotonic linear progressive queries via dynamic materialized views
CASCON '10: Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative ResearchThere is an increasing demand to process emerging types of queries, such as progressive queries (PQs), from numerous contemporary database applications including telematics, ecommerce, business intelligence, and decision support. Unlike a conventional ...
Equivalence and minimization of conjunctive queries under combined semantics
ICDT '12: Proceedings of the 15th International Conference on Database TheoryThe problems of query containment, equivalence, and minimization are fundamental problems in the context of query processing and optimization. In their classic work [2] published in 1977, Chandra and Merlin solved the three problems for the language of ...
Optimizing complex queries based on similarities of subqueries
As database technology is applied to more and more application domains, user queries are becoming increasingly complex (e.g. involving a large number of joins and a complex query structure). Query optimizers in existing database management systems (DBMS)...
Comments