research-article

Write-behind logging

Published: 01 November 2016

Abstract

The design of the logging and recovery components of database management systems (DBMSs) has always been influenced by the difference in the performance characteristics of volatile (DRAM) and non-volatile storage devices (HDD/SSDs). The key assumption has been that non-volatile storage is much slower than DRAM and only supports block-oriented read/writes. But the arrival of new non-volatile memory (NVM) storage that is almost as fast as DRAM with fine-grained read/writes invalidates these previous design choices.

This paper explores the changes that are required in a DBMS to leverage the unique properties of NVM in systems that still include volatile DRAM. We make the case for a new logging and recovery protocol, called write-behind logging, that enables a DBMS to recover nearly instantaneously from system failures. The key idea is that the DBMS logs what parts of the database have changed rather than how they were changed. Using this method, the DBMS flushes the changes to the database before recording them in the log. Our evaluation shows that this protocol improves a DBMS's transactional throughput by 1.3×, reduces the recovery time by more than two orders of magnitude, and shrinks the storage footprint of the DBMS on NVM by 1.5×. We also demonstrate that our logging protocol is compatible with standard replication schemes.
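
The ordering difference described above can be illustrated with a minimal sketch. This is a toy model rather than the paper's implementation: the class names, the `trace` list, and the dictionary-based "durable" storage are all assumptions made for illustration, and the sketch omits concurrency control, NVM flush instructions, and recovery entirely. It only shows that a write-ahead scheme records how the data changed before touching the database, whereas a write-behind scheme updates the database first and then records only what changed.

```python
class WriteAheadStore:
    """Toy write-ahead ordering: log how the data changed, then update it."""

    def __init__(self):
        self.log = []    # stands in for the durable log (hypothetical)
        self.data = {}   # stands in for the durable database (hypothetical)
        self.trace = []  # order of durable writes, kept only for illustration

    def commit(self, txn_id, writes):
        # The log record carries the new values, i.e. how the data changed.
        self.log.append({"txn": txn_id, "after": dict(writes)})
        self.trace.append("log")
        # Only after logging is the database itself updated.
        self.data.update(writes)
        self.trace.append("data")


class WriteBehindStore:
    """Toy write-behind ordering: update the data, then log what changed."""

    def __init__(self):
        self.log = []
        self.data = {}
        self.trace = []

    def commit(self, txn_id, writes):
        # The changes reach the database first.
        self.data.update(writes)
        self.trace.append("data")
        # The log record names only the changed keys, not their values.
        self.log.append({"txn": txn_id, "changed": sorted(writes)})
        self.trace.append("log")


if __name__ == "__main__":
    wal = WriteAheadStore()
    wal.commit(1, {"a": 1, "b": 2})
    wbl = WriteBehindStore()
    wbl.commit(1, {"a": 1, "b": 2})
    print(wal.trace, wal.log)
    print(wbl.trace, wbl.log)
```

In this toy model the write-behind log record holds no tuple values, which is consistent with the smaller storage footprint the abstract reports, although the sketch makes no attempt to reproduce the measured figures.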

  • Published in

    Proceedings of the VLDB Endowment, Volume 10, Issue 4
    November 2016
    180 pages
    ISSN: 2150-8097

    Publisher

    VLDB Endowment