skip to main content
10.1145/3318464.3380580acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

RID: Deduplicating Snapshot Computations

Published:31 May 2020Publication History

ABSTRACT

One can audit SQL applications by running SQL programs over sequences of persistent snapshots, but care is needed to avoid wasteful duplicate computation. This paper describes the design, implementation, and performance of RID, the first language-independent optimization framework that eliminates duplicate computations in SQL programs running over low-level snapshots by exploiting snapshot metadata efficiently.

Skip Supplemental Material Section

Supplemental Material

3318464.3380580.mp4

mp4

153.3 MB

References

  1. Michael H. Bö hlen and Christian S. Jensen. 2018. Sequenced Semantics. In Encyclopedia of Database Systems, Second Edition. https://doi.org/10.1007/978--1--4614--8265--9_1053Google ScholarGoogle Scholar
  2. Ramesh Chandra, Taesoo Kim, Meelap Shah, Neha Narula, and Nickolai Zeldovich. 2011. Intrusion recovery for database-backed web applications. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. ACM, 101--114.Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Transaction Processing Performance Council. 2010. TPC-H: Decision Support Benchmark. http://www.tpc.org/tpchGoogle ScholarGoogle Scholar
  4. Anton Dignös, Michael H Böhlen, Johann Gamper, and Christian S Jensen. 2016. Extending the kernel of a relational DBMS with comprehensive support for sequenced temporal queries. ACM Transactions on Database Systems (TODS), Vol. 41, 4 (2016), 26.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Anton Dignös, Boris Glavic, Xing Niu, Michael Böhlen, and Johann Gamper. 2019. Snapshot Semantics for Temporal Multiset Relations. The VLDB Journal - The International Journal on Very Large Data Bases, Vol. 12, 6 (2019), 639 -- 652.Google ScholarGoogle Scholar
  6. Pradeep Kumar Gunda, Lenin Ravindranath, Chandramohan A Thekkath, Yuan Yu, and Li Zhuang. 2010. Nectar: Automatic Management of Data and Computation in Datacenters.. In OSDI, Vol. 10. 1--8.Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Intel. 2010. Oracle Berkeley DB SQL API vs. SQLite API - Integration, Benefits and Differences. (October 2010).Google ScholarGoogle Scholar
  8. David Lomet, Roger Barga, Mohamed F Mokbel, and German Shegalov. 2006. Transaction time support inside a database engine. In 22nd International Conference on Data Engineering (ICDE'06). IEEE, 35--35.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. David Lomet, Roger Barga, Mohamed F Mokbel, German Shegalov, Rui Wang, and Yunyue Zhu. 2005. Immortal DB: transaction time support for SQL server. In Proceedings of the 2005 ACM SIGMOD international conference on Management of data. ACM, 939--941.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. David Lomet and Betty Salzberg. 1989. Access methods for multiversion data. Vol. 18. ACM.Google ScholarGoogle Scholar
  11. Donald Michie. 1968. Memo functions and machine learning. Nature, Vol. 218, 5136 (1968), 19.Google ScholarGoogle Scholar
  12. Soumyadeb Mitra, Marianne Winslett, Richard T. Snodgrass, Shashank Yaduvanshi, and Sumedh Ambokar. 2009. An Architecture for Regulatory Compliant Database Management. In Proceedings of the 25th International Conference on Data Engineering, ICDE 2009, March 29 2009 - April 2 2009, Shanghai, China. 162--173. https://doi.org/10.1109/ICDE.2009.69Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Olson, M. A., Bostic, K. and Seltzer, M. I. 1999. Berkeley DB. In Proceedings of USENIX Annual Technical Conference, FREENIX Track. Monterey, CA, USA .Google ScholarGoogle Scholar
  14. Gultekin Ozsoyoglu and Richard T Snodgrass. 1995. Temporal and real-time databases: A survey. IEEE Transactions on Knowledge and Data Engineering, Vol. 7, 4 (1995), 513--532.Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Betty Salzberg and Vassilis J Tsotras. 1999. Comparison of access methods for time-evolving data. ACM Computing Surveys (CSUR), Vol. 31, 2 (1999), 158--221.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Ross Shaull. 2013. A methodology for Retrospection Everywhere. Ph.D. Dissertation. Brandeis University.Google ScholarGoogle Scholar
  17. Ross Shaull, Liuba Shrira, and Barbara Liskov. 2014. A Modular and Efficient Past State System for Berkeley DB. In USENIX Annual Technical Conference. 157--168.Google ScholarGoogle Scholar
  18. Ross Shaull, Liuba Shrira, and Hao Xu. 2008. Skippy: a new snapshot indexing method for time travel in the storage manager. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data. ACM, 637--648.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Cheng Tan, Lingfan Yu, Joshua B. Leners, and Michael Walfish. 2017. The Efficient Server Audit Problem, Deduplicated Re-execution, and the Web. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17). 546--564.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Abdullah Uz Tansel, James Clifford, Shashi Gadia, Sushil Jajodia, Arie Segev, and Richard Snodgrass. 1993. Temporal databases: theory, design, and implementation. Benjamin-Cummings Publishing Co., Inc.Google ScholarGoogle Scholar
  21. Nikos Tsikoudis. 2020. Retrospective Computations over Sets of Snapshots: Design, Implementation and Optimization. Ph.D. Dissertation. Brandeis University.Google ScholarGoogle Scholar
  22. Nikos Tsikoudis, Liuba Shrira, and Sara Cohen. 2018. RQL: Retrospective Computations over Snapshot Sets. In Proceedings of the 21th International Conference on Extending Database Technology, EDBT. 600--611.Google ScholarGoogle Scholar

Index Terms

  1. RID: Deduplicating Snapshot Computations

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
        June 2020
        2925 pages
        ISBN:9781450367356
        DOI:10.1145/3318464

        Copyright © 2020 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 31 May 2020

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate785of4,003submissions,20%
      • Article Metrics

        • Downloads (Last 12 months)9
        • Downloads (Last 6 weeks)1

        Other Metrics

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader