skip to main content
10.1145/1989323.1989352acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Fast checkpoint recovery algorithms for frequently consistent applications

Published:12 June 2011Publication History

ABSTRACT

Advances in hardware have enabled many long-running applications to execute entirely in main memory. As a result, these applications have increasingly turned to database techniques to ensure durability in the event of a crash. However, many of these applications, such as massively multiplayer online games and main-memory OLTP systems, must sustain extremely high update rates - often hundreds of thousands of updates per second. Providing durability for these applications without introducing excessive overhead or latency spikes remains a challenge for application developers.

In this paper, we take advantage of frequent points of consistency in many of these applications to develop novel checkpoint recovery algorithms that trade additional space in main memory for significantly lower overhead and latency. Compared to previous work, our new algorithms do not require any locking or bulk copies of the application state. Our experimental evaluation shows that one of our new algorithms attains nearly constant latency and reduces overhead by more than an order of magnitude for low to medium update rates. Additionally, in a heavily loaded main-memory transaction processing system, it still reduces overhead by more than a factor of two.

References

  1. A. Ailamaki, D. DeWitt, M. Hill, and M. Skounakis. Weaving Relations for Cache Performance. In Proc. VLDB, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. P. Alvaro, T. Condie, N. Conway, K. Elmeleegy, J. M. Hellerstein, and R. C. Sears. BOOM: Data-centric programming in the datacenter. Technical Report UCB/EECS-2009--113, EECS Department, University of California, Berkeley, 2009.Google ScholarGoogle Scholar
  3. J. Bartlett, J. Gray, and B. Horst. Fault tolerance in tandem computer systems. Technical Report 86.2, PN87616, Tandem Computers, 1986.Google ScholarGoogle Scholar
  4. A. Beguelin, E. Seligman, and P. Stephan. Application Level Fault Tolerance in Heterogeneous Networks of Workstations. Journal of Parallel and Distributed Computing, 43(2):147--155, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. G. Bronevetsky, M. Schulz, P. Szwed, D. Marques, and K. Pingali. Application-level Checkpointing for Shared Memory Programs. In Proc. ASPLOS, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. K. M. Chandy and L. Lamport. Distributed Snapshots: Determining Global States of Distributed Systems. ACM TOCS, 3(1):63--75, 1985. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. R. Cortez. World Class Networking Infrastructure. In Proc. Austin GDC, 2007.Google ScholarGoogle Scholar
  8. D. J. DeWitt, R. Katz, F. Olken, L. Shapiro, M. Stonebraker, and D. Wood. Implementation Techniques for Main Memory Database Systems. In Proc. SIGMOD, 1984. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. Elnozahy, L. Alvisi, Y.-M. Wang, and D. B. Johnson. A Survey of Rollback-Recovery Protocols in Message-Passing Systems. ACM Computing Surveys, 34(3):375--408, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. H. F. Gujónsson. The Server Technology of EVE Online: How to Cope With 300,000 Players on One Server. In Proc. Austin GDC, 2008.Google ScholarGoogle Scholar
  11. N. Gupta, A. J. Demers, J. Gehrke, P. Unterbrunner, and W. M. White. Scalability for virtual worlds. In ICDE, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. R. Hagmann. A Crash Recovery Scheme for a Memory-Resident Database System. IEEE Transactions on Computers, 35(9):839--843, 1986. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Herlihy. Wait-free Synchronization. ACM TOPLAS, 13(1):124--149, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. S.-O. Hvasshovd, O. Torbjornsen, S. Bratsberg, and P. Holager. The ClustRa Telecom Database: High Availability, High Throughput, and Real-Time Response. In Proc. VLDB, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Intel VTune Performance Analyzer. http://software.intel.com/en-us/intel-vtune.Google ScholarGoogle Scholar
  16. Jason Gregory. Game Engine Architecture (Section 7.5). A K Peters, 2009.Google ScholarGoogle Scholar
  17. E. P. C. Jones, D. J. Abadi, and S. Madden. Low overhead concurrency control for partitioned main memory databases. In Proc. SIGMOD, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. E. Lau and S. Madden. An Integrated Approach to Recovery and High Availability in an Updatable, Distributed Data Warehouse. In Proc. VLDB, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny. Checkpoint and Migration of Unix Processes in the Condor Distributed Processing System. Technical Report 1346, University of Winsconsin-Madison, 1997.Google ScholarGoogle Scholar
  20. T. MacDonald. Solid-state Storage Not Just a Flash in the Pan. Storage Magazine, 2007. http://searchStorage.techtarget.com/magazineFeature/0,2%96894,sid5_gci1276095,00.html http://searchStorage.techtarget.com/magazineFeature/0,296894,sid5_gci1276%095,00.html.Google ScholarGoogle Scholar
  21. C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. ACM TODS, 17(1):94--162, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. J. S. Plank, M. Beck, G. Kingsley, and K. Li. Libckpt: Transparent Checkpointing under UNIX. In Proc. USENIX Winter Technical Conference, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. C. Pu. On-the-Fly, Incremental, Consistent Reading of Entire Databases. Algorithmica, 1:271--287, 1986.Google ScholarGoogle ScholarCross RefCross Ref
  24. RamSan-400 Specifications. http://www.ramsan.com/products/ramsan-400.htm http://www.ramsan.com/products/ramsan-400.htm.Google ScholarGoogle Scholar
  25. D. Rosenkrantz. Dynamic Database Dumping. In Proc. SIGMOD, 1978. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. K. Salem and H. Garcia-Molina. Checkpointing Memory-Resident Databases. In Proc. ICDE, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. M. V. Salles, T. Cao, B. Sowell, A. Demers, J. Gehrke, C. Koch, and W. White. An evaluation of checkpoint recovery for massively multiplayer online games. In Proc. VLDB, 2009.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. M. Stonebraker, S. Madden, D. Abadi, S. Harizopoulos, N. Hachem, and P. Helland. The End of an Architectural Era (It's Time for a Complete Rewrite). In Proc. VLDB, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. T. Strohman and W. B. Croft. Efficient Document Retrieval in Main Memory. In Proc. SIGIR, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. A. Thomson and D. Abadi. The case for determinism in database systems. In Proc. VLDB, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Transaction Processing Council. TPC Benchmark(TM) C, 2010. http://www.tpc.org/tpcc/spec/tpcc_current.pdf.Google ScholarGoogle Scholar
  32. P. Unterbrunner, G. Giannikis, G. Alonso, D. Fauser, and D. Kossmann. Predictable performance for unpredictable workloads. PVLDB, 2(1):706--717, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. VoltDB. http://voltdb.com/product.Google ScholarGoogle Scholar
  34. G. Wang, M. V. Salles, B. Sowell, X. Wang, T. Cao, A. Demers, J. Gehrke, and W. White. Behavioral simulations in mapreduce. In Proc. VLDB, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. W. White, A. Demers, C. Koch, J. Gehrke, and R. Rajagopalan. Scaling Games to Epic Proportions. In Proc. SIGMOD, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. A. Whitney, D. Shasha, and S. Apter. High Volume Transaction Processing Without Concurrency Control, Two Phase Commit, SQL, or C+. In Proc. HPTS, 1997.Google ScholarGoogle Scholar
  37. G. Zheng, L. Shi, and L. V. Kale. FTC-Charm: an In-Memory Checkpoint-Based Fault Tolerant Runtime for Charm and MPI. In Proc. CLUSTER, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Fast checkpoint recovery algorithms for frequently consistent applications

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
        June 2011
        1364 pages
        ISBN:9781450306614
        DOI:10.1145/1989323

        Copyright © 2011 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 12 June 2011

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate785of4,003submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader