
The BigDAWG Polystore System

Published: 12 August 2015

Abstract

This paper presents a new view of federated databases to address the growing need for managing information that spans multiple data models. This trend is fueled by the proliferation of storage engines and query languages based on the observation that 'no one size fits all'. To address this shift, we propose a polystore architecture; it is designed to unify querying over multiple data models. We consider the challenges and opportunities associated with polystores. Open questions in this space revolve around query optimization and the assignment of objects to storage engines. We introduce our approach to these topics and discuss our prototype in the context of the Intel Science and Technology Center for Big Data.



    • Published in

      ACM SIGMOD Record, Volume 44, Issue 2
      June 2015
      56 pages
      ISSN: 0163-5808
      DOI: 10.1145/2814710

      Copyright © 2015 Copyright is held by the owner/author(s)

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • column
