skip to main content
article

A comparison of knives for bread slicing

Published:01 April 2013Publication History
Skip Abstract Section

Abstract

Vertical partitioning is a crucial step in physical database design in row-oriented databases. A number of vertical partitioning algorithms have been proposed over the last three decades for a variety of niche scenarios. In principle, the underlying problem remains the same: decompose a table into one or more vertical partitions. However, it is not clear how good different vertical partitioning algorithms are in comparison to each other. In fact, it is not even clear how to experimentally compare different vertical partitioning algorithms. In this paper, we present an exhaustive experimental study of several vertical partitioning algorithms. We categorize vertical partitioning algorithms along three dimensions. We survey six vertical partitioning algorithms and discuss their pros and cons. We identify the major differences in the use-case settings for different algorithms and describe how to make an apples-to-apples comparison of different vertical partitioning algorithms under the same setting. We propose four metrics to compare vertical partitioning algorithms. We show experimental results from the TPC-H and SSB benchmark and present four key lessons learned: (1) we can do four orders of magnitude less computation and still find the optimal layouts, (2) the benefits of vertical partitioning depend strongly on the database buffer size, (3) HillClimb is the best vertical partitioning algorithm, and (4) vertical partitioning for TPC-H-like benchmarks can improve over column layout by only up to 5%.

References

  1. S. Agrawal, V. Narasayya, and B. Yang. Integrating Vertical and Horizontal Partitioning Into Automated Physical Database Design. In ACM SIGMOD, pages 359-370, 2004. Google ScholarGoogle Scholar
  2. Bonnie++, coker.com.au/bonnie++.Google ScholarGoogle Scholar
  3. W. W. Chu and I. T. Ieong. A Transaction-Based Approach to Vertical Partitioning for Relational Database Systems. IEEE Trans. Softw. Eng., 19(8):804-812, 1993. Google ScholarGoogle Scholar
  4. D. W. Cornell and P. S. Yu. A Vertical Partitioning Algorithm for Relational Databases. In ICDE, pages 30-35, 1987. Google ScholarGoogle Scholar
  5. D. W. Cornell and P. S. Yu. An Effective Approach to Vertical Partitioning for Physical Design of Relational Databases. IEEE Trans. Softw. Eng., 16(2):248-258, 1990. Google ScholarGoogle Scholar
  6. M. Grund, J. Krüger, H. Plattner, A. Zeier, P. Cudre-Mauroux, and S. Madden. HYRISE: A Main Memory Hybrid Storage Engine. PVLDB, 4(2):105-116, 2010. Google ScholarGoogle Scholar
  7. M. Hammer and B. Niamir. A Heuristic Approach to Attribute Partitioning. In ACM SIGMOD, pages 93-101, 1979. Google ScholarGoogle Scholar
  8. R. A. Hankins and J. M. Patel. Data Morphing: An Adaptive, Cache-Conscious Storage Technique. In VLDB, pages 417-428, 2003. Google ScholarGoogle Scholar
  9. HBase, hbase.apache.org.Google ScholarGoogle Scholar
  10. J. A. Hoffer and D. G. Severance. The Use of Cluster Analysis in Physical Data Base Design. In VLDB, pages 69-86, 1975. Google ScholarGoogle Scholar
  11. A. Jindal and J. Dittrich. Relax and let the database do the partitioning online. In BIRTE, pages 65-80, 2011.Google ScholarGoogle Scholar
  12. A. Jindal, J.-A. Quiané-Ruiz, and J. Dittrich. Trojan Data Layouts: Right Shoes for a Running Elephant. In ACM SOCC, pages 21:1-21:14, 2011. Google ScholarGoogle Scholar
  13. A. Jindal, F. M. Schuhknecht, J. Dittrich, K. Khachatryan, and A. Bunte. How Achaeans Would Construct Columns in Troy. In CIDR, 2013.Google ScholarGoogle Scholar
  14. W. T. M. Jr., P. J. Schweitzer, and T. W. White. Problem Decomposition and Data Reorganization by a Clustering Technique. Operations Research, 20(5):993-1009, 1972.Google ScholarGoogle Scholar
  15. S. Navathe, S. Ceri, G. Wiederhold, and J. Dou. Vertical Partitioning Algorithms for Database Design. ACM TODS, 9(4):680-710, 1984. Google ScholarGoogle Scholar
  16. S. B. Navathe and M. Ra. Vertical Partitioning for Database Design: A Graphical Algorithm. In ACM SIGMOD, pages 440-450, 1989. Google ScholarGoogle Scholar
  17. S. Papadomanolakis and A. Ailamaki. AutoPart: Automating Schema Design for Large Scientific Databases Using Data Partitioning. In SSDBM, pages 383-392, 2004. Google ScholarGoogle Scholar
  18. V. Raman, G. Swart, L. Qiao, F. Reiss, V. Dialani, D. Kossmann, I. Narang, and R. Sidle. Constant-Time Query Processing. In ICDE, pages 60-69, 2008. Google ScholarGoogle Scholar
  19. Star Schema Benchmark, www.cs.umb.edu/~poneil/StarSchemaB.PDF.Google ScholarGoogle Scholar
  20. Vertica, vertica.com.Google ScholarGoogle Scholar

Index Terms

  1. A comparison of knives for bread slicing
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image Proceedings of the VLDB Endowment
        Proceedings of the VLDB Endowment  Volume 6, Issue 6
        April 2013
        144 pages

        Publisher

        VLDB Endowment

        Publication History

        • Published: 1 April 2013
        Published in pvldb Volume 6, Issue 6

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader