research-article

Aether: a scalable approach to logging

Published: 01 September 2010

Abstract

The shift to multi-core hardware brings new challenges to database systems, because software parallelism now determines performance. Even though database systems traditionally accommodate simultaneous requests, a multitude of synchronization barriers serialize execution. Write-ahead logging is a fundamental, omnipresent component in ARIES-style concurrency and recovery, and one of the most important yet-to-be-addressed potential bottlenecks, especially in OLTP workloads that make frequent small changes to data.

In this paper, we identify four logging-related impediments to database system scalability. Each issue challenges a different level of the software architecture: (a) the high volume of small-sized I/O requests may saturate the disk, (b) transactions hold locks while waiting for the log flush, (c) extensive context switching overwhelms the OS scheduler with threads executing log I/Os, and (d) contention appears as transactions serialize accesses to in-memory log data structures. We demonstrate these problems and address them with techniques that, when combined, comprise a holistic, scalable approach to logging. Our solution achieves a 20%-69% speedup over a modern database system when running log-intensive workloads, such as the TPC-B and TATP benchmarks. Moreover, it achieves log insert throughput of over 1.8 GB/s for small log records on a single-socket server, an order of magnitude higher than the traditional way of accessing the log using a single mutex.
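
As a rough illustration of the baseline the abstract contrasts against, the sketch below models a log buffer guarded by a single mutex, with a flush that writes all pending records in one request. It is not the paper's implementation: the class `MutexLogBuffer`, its methods, and the helper `run_demo` are hypothetical names, and an in-memory `io.BytesIO` stands in for the log device. It only shows where issues (a) and (d) arise: every insert takes the same lock, and batching many small records into one write reduces the number of I/O requests.

```python
import io
import threading


class MutexLogBuffer:
    """Baseline log buffer: one mutex serializes every insert (hypothetical sketch)."""

    def __init__(self, sink):
        self._mutex = threading.Lock()
        self._pending = bytearray()  # records not yet written to the sink
        self._next_lsn = 0           # byte offset the next record will receive
        self._durable_lsn = 0        # everything below this offset was flushed
        self._sink = sink            # assumed stand-in for the log device
        self.flush_calls = 0         # number of write requests issued

    def insert(self, record: bytes) -> int:
        """Append one record and return its LSN (its byte offset in the log)."""
        with self._mutex:  # every transaction contends on this one lock
            lsn = self._next_lsn
            self._pending += record
            self._next_lsn += len(record)
            return lsn

    def flush(self) -> int:
        """Write all pending records with a single request; return the durable LSN."""
        with self._mutex:  # inserts are blocked while the write is in progress
            if self._pending:
                self._sink.write(bytes(self._pending))
                self._durable_lsn = self._next_lsn
                self._pending.clear()
                self.flush_calls += 1
            return self._durable_lsn


def run_demo(num_threads=4, records_per_thread=100):
    """Insert 16-byte records from several threads, then flush once."""
    sink = io.BytesIO()
    log = MutexLogBuffer(sink)
    lsns = []
    lsns_lock = threading.Lock()

    def worker(tid):
        record = bytes([tid]) * 16
        mine = [log.insert(record) for _ in range(records_per_thread)]
        with lsns_lock:
            lsns.extend(mine)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    durable = log.flush()
    return sorted(lsns), durable, log.flush_calls, len(sink.getvalue())
```

In this sketch, 400 small inserts reach the sink through a single write, but each insert still passes through the same critical section, which is the kind of serialization the abstract identifies as a scalability limit.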

Published in

Proceedings of the VLDB Endowment, Volume 3, Issue 1-2
September 2010
1658 pages

Publisher

VLDB Endowment
