ABSTRACT
The advent of affordable, shared-nothing computing systems portends a new class of parallel database management systems (DBMS) for on-line transaction processing (OLTP) applications that scale without sacrificing ACID guarantees [7, 9]. The performance of these DBMSs is predicated on the existence of an optimal database design that is tailored for the unique characteristics of OLTP workloads. Deriving such designs for modern DBMSs is difficult, especially for enterprise-class OLTP systems, since they impose extra challenges: the use of stored procedures, the need for load balancing in the presence of time-varying skew, complex schemas, and deployments with larger number of partitions.
To this purpose, we present a novel approach to automatically partitioning databases for enterprise-class OLTP systems that significantly extends the state of the art by: (1) minimizing the number distributed transactions, while concurrently mitigating the effects of temporal skew in both the data distribution and accesses, (2) extending the design space to include replicated secondary indexes, (4) organically handling stored procedure routing, and (3) scaling of schema complexity, data size, and number of partitions. This effort builds on two key technical contributions: an analytical cost model that can be used to quickly estimate the relative coordination cost and skew for a given workload and a candidate database design, and an informed exploration of the huge solution space based on large neighborhood search. To evaluate our methods, we integrated our database design tool with a high-performance parallel, main memory DBMS and compared our methods against both popular heuristics and a state-of-the-art research prototype [17]. Using a diverse set of benchmarks, we show that our approach improves throughput by up to a factor of 16x over these other approaches.
- H-Store: Next Generation OLTP DBMS Research. http://hstore.cs.brown.edu.Google Scholar
- Wikipedia MySQL Server Roles. https://wikitech.wikimedia.org/view/Server_roles.Google Scholar
- S. Agrawal, S. Chaudhuri, A. Das, and V. Narasayya. Automating layout of relational databases. In ICDE, pages 607--618, 2003.Google Scholar
- S. Agrawal, S. Chaudhuri, and V. R. Narasayya. Automated selection of materialized views and indexes in SQL databases. In VLDB, 2000. Google ScholarDigital Library
- S. Agrawal, V. Narasayya, and B. Yang. Integrating vertical and horizontal partitioning into automated physical database design. In SIGMOD, 2004. Google ScholarDigital Library
- V. Angkanawaraphan and A. Pavlo. AuctionMark: A benchmark for high-performance oltp systems.Google Scholar
- M. Aslett. How will the database incumbents respond to NoSQL and NewSQL? The 451 Group, April 2011.Google Scholar
- N. Bassiliades and I. P. Vlahavas. A non-uniform data fragmentation strategy for parallel main-memory database systems. In VLDB, pages 370--381, 1995. Google ScholarDigital Library
- R. Cattell. Scalable SQL and NoSQL data stores. SIGMOD Rec., 39:12--27, 2011. Google ScholarDigital Library
- S. Ceri, S. Navathe, and G. Wiederhold. Distribution design of logical database schemas. IEEE Trans. Softw. Eng., 9(4):487--504, 1983. Google ScholarDigital Library
- S. Chaudhuri, A. K. Gupta, and V. Narasayya. Compressing SQL workloads. In SIGMOD, pages 488--499, 2002. Google ScholarDigital Library
- S. Chaudhuri and V. Narasayya. Autoadmin "what-if" index analysis utility. SIGMOD Rec., 27(2):367--378, 1998. Google ScholarDigital Library
- S. Chaudhuri and V. R. Narasayya. An efficient cost-driven index selection tool for microsoft SQL server. In VLDB, pages 146--155, 1997. Google ScholarDigital Library
- Y. C. Cheng, L. Gruenwald, G. Ingels, and M. T. Thakkar. Evaluating partitioning techniques for main memory database: Horizontal and single vertical. In ICCI, pages 570--574, 1993. Google ScholarDigital Library
- J. Coleman and R. Grosman. Unlimited Scale-up of DB2 Using Server-assisted Client Redirect. http://ibm.co/fLR2cH, October 2005.Google Scholar
- G. Copeland, W. Alexander, E. Boughter, and T. Keller. Data placement in bubba. SIGMOD, 17(3):99--108, 1988. Google ScholarDigital Library
- C. Curino, Y. Zhang, E. Jones, and S. Madden. Schism: a workload-drive approach to database replication and partitioning. In VLDB, 2010. Google ScholarDigital Library
- E. Danna and L. Perron. Structured vs. unstructured large neighborhood search: A case study on job-shop scheduling problems with earliness and tardiness costs. In Principles and Practice of Constraint Programming, volume 2833, pages 817--821, 2003.Google Scholar
- S. Duan, V. Thummala, and S. Babu. Tuning database configuration parameters with iTuned. VLDB, 2:1246--1257, August 2009. Google ScholarDigital Library
- S. Finkelstein, M. Schkolnick, and P. Tiberio. Physical database design for relational databases. ACM Trans. Database Syst., 13(1):91--128, 1988. Google ScholarDigital Library
- F. Focacci, F. Laburthe, and A. Lodi. Handbook of Metaheuristics, chapter Local Search and Constraint Programming. Springer, 2003.Google Scholar
- N. Folkman. So, that was a bummer. http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/, October 2010.Google Scholar
- S. Ghandeharizadeh, D. J. DeWitt, and W. Qureshi. A performance analysis of alternative multi-attribute declustering strategies. SIGMOD, 21(2):29--38, 1992. Google ScholarDigital Library
- S. Harizopoulos, D. J. Abadi, S. Madden, and M. Stonebraker. OLTP through the looking glass, and what we found there. In SIGMOD, pages 981--992, 2008. Google ScholarDigital Library
- E. P. Jones, D. J. Abadi, and S. Madden. Low overhead concurrency control for partitioned main memory databases. In SIGMOD, pages 603--614, 2010. Google ScholarDigital Library
- R. Kallman, H. Kimura, J. Natkins, A. Pavlo, A. Rasin, S. Zdonik, E. P. C. Jones, S. Madden, M. Stonebraker, Y. Zhang, J. Hugg, and D. J. Abadi. H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow., 1(2):1496--1499, 2008. Google ScholarDigital Library
- J. Krueger, C. Kim, M. Grund, N. Satish, D. Schwalb, J. Chhugani, H. Plattner, P. Dubey, and A. Zeier. Fast updates on Read-Optimized databases using Multi-Core CPUs. VLDB, 5:61--72, September 2011. Google ScholarDigital Library
- M. Livny, S. Khoshafian, and H. Boral. Multi-disk management algorithms. SIGMETRICS Perform. Eval. Rev., 15(1):69--77, 1987. Google ScholarDigital Library
- S. Manegold, P. Boncz, and M. L. Kersten. Generic database cost models for hierarchical memory systems. In VLDB, pages 191--202, 2002. Google ScholarDigital Library
- M. Mehta and D. J. DeWitt. Data placement in shared-nothing parallel database systems. The VLDB Journal, 6(1):53--72, 1997. Google ScholarDigital Library
- R. Mukkamala, S. C. Bruell, and R. K. Shultz. Design of partially replicated distributed database systems: an integrated methodology. SIGMETRICS Perform. Eval. Rev., 16(1):187--196, 1988. Google ScholarDigital Library
- R. Nehme and N. Bruno. Automated partitioning design in parallel database systems. In SIGMOD, SIGMOD, pages 1137--1148, 2011. Google ScholarDigital Library
- C. Nikolaou, A. Labrinidis, V. Bohn, D. Ferguson, M. Artavanis, C. Kloukinas, and M. Marazakis. The impact of workload clustering on transaction routing. Technical report, FORTH-ICS TR-238, 1998.Google Scholar
- C. N. Nikolaou, M. Marazakis, and G. Georgiannakis. Transaction routing for distributed OLTP systems: survey and recent results. Inf. Sci., 97:45--82, 1997. Google ScholarDigital Library
- S. Padmanabhan. Data placement in shared-nothing parallel database systems. PhD thesis, University of Michigan, 1992. Google ScholarDigital Library
- S. Papadomanolakis and A. Ailamaki. Autopart: Automating schema design for large scientific databases using data partitioning. In SSDBM, 2004. Google ScholarDigital Library
- A. Pavlo, E. P. Jones, and S. Zdonik. On predictive modeling for optimizing transaction execution in parallel OLTP systems. VLDB, 5:85--96, October 2011. Google ScholarDigital Library
- E. Rahm. A framework for workload allocation in distributed transaction processing systems. J. Syst. Softw., 18:171--190, May 1992. Google ScholarDigital Library
- J. Rao, C. Zhang, N. Megiddo, and G. Lohman. Automating physical database design in a parallel database. In SIGMOD, pages 558--569, 2002. Google ScholarDigital Library
- P. Scheuermann, G. Weikum, and P. Zabback. Data partitioning and load balancing in parallel disk systems. The VLDB Journal, 7(1):48--66, 1998. Google ScholarDigital Library
- J. Sobel. Scaling Out (Facebook). http://on.fb.me/p7i7eK, April 2006.Google Scholar
- M. Stonebraker and R. Cattell. 10 rules for scalable performance in "simple operation" datastores. Commun. ACM, 54:72--80, June 2011. Google ScholarDigital Library
- M. Stonebraker, S. Madden, D. J. Abadi, S. Harizopoulos, N. Hachem, and P. Helland. The end of an architectural era: (it's time for a complete rewrite). In VLDB, pages 1150--1160, 2007. Google ScholarDigital Library
- M. Stonebraker and A. Pavlo. The SEATS Airline Ticketing Systems Benchmark. http://hstore.cs.brown.edu/projects/seats.Google Scholar
- The Transaction Processing Council. TPC-C Benchmark (Revision 5.9.0). http://www.tpc.org/tpcc/, June 2007.Google Scholar
- The Transaction Processing Council. TPC-E Benchmark (Revision 1.23.0). http://www.tpc.org/tpce/, June 2010.Google Scholar
- C. B. Walton, A. G. Dale, and R. M. Jenevein. A taxonomy and performance model of data skew effects in parallel joins. In VLDB, pages 537--548, 1991. Google ScholarDigital Library
- A. Wolski. TATP Benchmark Description (Version 1.0). http://tatpbenchmark.sourceforge.net, March 2009.Google Scholar
- D. C. Zilio. Physical Database Design Decision Algorithms and Concurrent Reorganization for Parallel Database Systems. PhD thesis, University of Toronto, 1998. Google ScholarDigital Library
- D. C. Zilio, J. Rao, S. Lightstone, G. Lohman, A. Storm, C. Garcia-Arellano, and S. Fadden. DB2 design advisor: integrated automatic physical database design. In VLDB, pages 1087--1097, 2004. Google ScholarDigital Library
Index Terms
- Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems
Recommendations
Hekaton: SQL server's memory-optimized OLTP engine
SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of DataHekaton is a new database engine optimized for memory resident data and OLTP workloads. Hekaton is fully integrated into SQL Server; it is not a separate system. To take advantage of Hekaton, a user simply declares a table memory optimized. Hekaton ...
Workload-aware incremental repartitioning of shared-nothing distributed databases for scalable OLTP applications
On-line Transaction Processing (OLTP) applications often rely on shared-nothing distributed databases that can sustain rapid growth in data volume. Distributed transactions (DTs) that involve data tuples from multiple geo-distributed servers can ...
Is it DSS or OLTP: automatically identifying DBMS workloads
The type of the workload on a database management system (DBMS) is a key consideration in tuning the system. Allocations for resources such as main memory can be very different depending on whether the workload type is Online Transaction Processing (...
Comments