DOI: 10.1145/2452376.2452402
research-article

An automatic physical design tool for clustered column-stores

Published: 18 March 2013

ABSTRACT

Good database design is typically a very difficult and costly process. As database systems get more complex and as the amount of data under management grows, the stakes increase accordingly. Past research produced a number of design tools capable of automatically selecting secondary indexes and materialized views for a known workload. However, a significant bulk of research on automated database design has been done in the context of row-store DBMSes. While this work has produced effective design tools, new specialized database architectures demand a rethinking of automated design algorithms.

In this paper, we present results for an automatic design tool aimed at column-oriented DBMSes running OLAP workloads. In particular, we target a commercial column-store DBMS that supports data sorting. In this setting, the key problem is selecting proper sort orders and compression schemes for the columns, as well as appropriate pre-join views. This paper describes our automatic design algorithms and the results of experiments that apply the tool to realistic data sets.

Published in

EDBT '13: Proceedings of the 16th International Conference on Extending Database Technology
March 2013
793 pages
ISBN: 9781450315975
DOI: 10.1145/2452376

Copyright © 2013 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 7 of 10 submissions (70%)
