Abstract
Traditional database systems are built around the query-at-a-time model. This approach tries to optimize performance on a best-effort basis. Unfortunately, best effort is not good enough for many modern applications, which require response-time guarantees under high load. This paper describes the design of a new database architecture based on batching queries and sharing computation across possibly hundreds of concurrent queries and updates. Performance experiments with the TPC-W benchmark show that the performance of our implementation, SharedDB, is indeed robust across a wide range of dynamic workloads.
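To make the idea of batching queries and sharing computation concrete, here is a minimal sketch of one familiar form of sharing: a single table scan that serves a whole batch of queries. It is an illustration under stated assumptions, not a description of SharedDB's actual operators. The function name `shared_scan`, the query ids, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def shared_scan(table: List[Row], batch: Dict[int, Predicate]) -> Dict[int, List[Row]]:
    """Answer every query in `batch` with a single pass over `table`.

    Hypothetical helper: each query is reduced to a selection predicate,
    keyed by a query id.
    """
    results: Dict[int, List[Row]] = {qid: [] for qid in batch}
    for row in table:  # each row is read once, however many queries are batched
        for qid, predicate in batch.items():
            if predicate(row):
                results[qid].append(row)
    return results


# Hypothetical data and queries, for illustration only.
items = [
    {"id": 1, "price": 5},
    {"id": 2, "price": 20},
    {"id": 3, "price": 50},
]
batch = {
    101: lambda r: r["price"] < 10,
    102: lambda r: r["price"] >= 20,
}
answers = shared_scan(items, batch)
```

In a query-at-a-time engine, each of the two queries would trigger its own scan. Here the number of passes over the data stays at one as the batch grows, and only the per-row predicate work increases, which is one way such a design can keep response times steady under load.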