Abstract
Vertical partitioning is a crucial step in physical database design in row-oriented databases. A number of vertical partitioning algorithms have been proposed over the last three decades for a variety of niche scenarios. In principle, the underlying problem remains the same: decompose a table into one or more vertical partitions. However, it is not clear how good different vertical partitioning algorithms are in comparison to each other. In fact, it is not even clear how to experimentally compare different vertical partitioning algorithms. In this paper, we present an exhaustive experimental study of several vertical partitioning algorithms. We categorize vertical partitioning algorithms along three dimensions. We survey six vertical partitioning algorithms and discuss their pros and cons. We identify the major differences in the use-case settings for different algorithms and describe how to make an apples-to-apples comparison of different vertical partitioning algorithms under the same setting. We propose four metrics to compare vertical partitioning algorithms. We show experimental results from the TPC-H and SSB benchmark and present four key lessons learned: (1) we can do four orders of magnitude less computation and still find the optimal layouts, (2) the benefits of vertical partitioning depend strongly on the database buffer size, (3) HillClimb is the best vertical partitioning algorithm, and (4) vertical partitioning for TPC-H-like benchmarks can improve over column layout by only up to 5%.
- S. Agrawal, V. Narasayya, and B. Yang. Integrating Vertical and Horizontal Partitioning Into Automated Physical Database Design. In ACM SIGMOD, pages 359-370, 2004. Google Scholar
- Bonnie++, coker.com.au/bonnie++.Google Scholar
- W. W. Chu and I. T. Ieong. A Transaction-Based Approach to Vertical Partitioning for Relational Database Systems. IEEE Trans. Softw. Eng., 19(8):804-812, 1993. Google Scholar
- D. W. Cornell and P. S. Yu. A Vertical Partitioning Algorithm for Relational Databases. In ICDE, pages 30-35, 1987. Google Scholar
- D. W. Cornell and P. S. Yu. An Effective Approach to Vertical Partitioning for Physical Design of Relational Databases. IEEE Trans. Softw. Eng., 16(2):248-258, 1990. Google Scholar
- M. Grund, J. Krüger, H. Plattner, A. Zeier, P. Cudre-Mauroux, and S. Madden. HYRISE: A Main Memory Hybrid Storage Engine. PVLDB, 4(2):105-116, 2010. Google Scholar
- M. Hammer and B. Niamir. A Heuristic Approach to Attribute Partitioning. In ACM SIGMOD, pages 93-101, 1979. Google Scholar
- R. A. Hankins and J. M. Patel. Data Morphing: An Adaptive, Cache-Conscious Storage Technique. In VLDB, pages 417-428, 2003. Google Scholar
- HBase, hbase.apache.org.Google Scholar
- J. A. Hoffer and D. G. Severance. The Use of Cluster Analysis in Physical Data Base Design. In VLDB, pages 69-86, 1975. Google Scholar
- A. Jindal and J. Dittrich. Relax and let the database do the partitioning online. In BIRTE, pages 65-80, 2011.Google Scholar
- A. Jindal, J.-A. Quiané-Ruiz, and J. Dittrich. Trojan Data Layouts: Right Shoes for a Running Elephant. In ACM SOCC, pages 21:1-21:14, 2011. Google Scholar
- A. Jindal, F. M. Schuhknecht, J. Dittrich, K. Khachatryan, and A. Bunte. How Achaeans Would Construct Columns in Troy. In CIDR, 2013.Google Scholar
- W. T. M. Jr., P. J. Schweitzer, and T. W. White. Problem Decomposition and Data Reorganization by a Clustering Technique. Operations Research, 20(5):993-1009, 1972.Google Scholar
- S. Navathe, S. Ceri, G. Wiederhold, and J. Dou. Vertical Partitioning Algorithms for Database Design. ACM TODS, 9(4):680-710, 1984. Google Scholar
- S. B. Navathe and M. Ra. Vertical Partitioning for Database Design: A Graphical Algorithm. In ACM SIGMOD, pages 440-450, 1989. Google Scholar
- S. Papadomanolakis and A. Ailamaki. AutoPart: Automating Schema Design for Large Scientific Databases Using Data Partitioning. In SSDBM, pages 383-392, 2004. Google Scholar
- V. Raman, G. Swart, L. Qiao, F. Reiss, V. Dialani, D. Kossmann, I. Narang, and R. Sidle. Constant-Time Query Processing. In ICDE, pages 60-69, 2008. Google Scholar
- Star Schema Benchmark, www.cs.umb.edu/~poneil/StarSchemaB.PDF.Google Scholar
- Vertica, vertica.com.Google Scholar
Index Terms
- A comparison of knives for bread slicing
Recommendations
Visual comparison of hierarchically organized data
EuroVis'08: Proceedings of the 10th Joint Eurographics / IEEE - VGTC conference on VisualizationWe provide a novel visualization method for the comparison of hierarchically organized data. Our technique visualizes a pair of hierarchies that are to be compared and simultaneously depicts how these hierarchies are related by explicitly visualizing ...
Database slicing on relational databases
Many software systems today use databases to permanently store their data. Testing, bug finding and migration are complex problems in the case of databases that contain many records. Here, our method can speed up these processes if we can select a ...
Internet advertising strategy by comparison challenge approach
ICEC '03: Proceedings of the 5th international conference on Electronic commerceA comparison challenge approach is proposed as a form of challenger-activated, just-in-time advertising. To develop a framework for a comparison challenge, we propose a theory of comparison. Based on this theory, the CompareMe and CompareThem strategies ...
Comments