ABSTRACT
Advances in hardware have enabled many long-running applications to execute entirely in main memory. As a result, these applications have increasingly turned to database techniques to ensure durability in the event of a crash. However, many of these applications, such as massively multiplayer online games and main-memory OLTP systems, must sustain extremely high update rates - often hundreds of thousands of updates per second. Providing durability for these applications without introducing excessive overhead or latency spikes remains a challenge for application developers.
In this paper, we take advantage of frequent points of consistency in many of these applications to develop novel checkpoint recovery algorithms that trade additional space in main memory for significantly lower overhead and latency. Compared to previous work, our new algorithms do not require any locking or bulk copies of the application state. Our experimental evaluation shows that one of our new algorithms attains nearly constant latency and reduces overhead by more than an order of magnitude for low to medium update rates. Additionally, in a heavily loaded main-memory transaction processing system, it still reduces overhead by more than a factor of two.
- A. Ailamaki, D. DeWitt, M. Hill, and M. Skounakis. Weaving Relations for Cache Performance. In Proc. VLDB, 2001. Google ScholarDigital Library
- P. Alvaro, T. Condie, N. Conway, K. Elmeleegy, J. M. Hellerstein, and R. C. Sears. BOOM: Data-centric programming in the datacenter. Technical Report UCB/EECS-2009--113, EECS Department, University of California, Berkeley, 2009.Google Scholar
- J. Bartlett, J. Gray, and B. Horst. Fault tolerance in tandem computer systems. Technical Report 86.2, PN87616, Tandem Computers, 1986.Google Scholar
- A. Beguelin, E. Seligman, and P. Stephan. Application Level Fault Tolerance in Heterogeneous Networks of Workstations. Journal of Parallel and Distributed Computing, 43(2):147--155, 1997. Google ScholarDigital Library
- G. Bronevetsky, M. Schulz, P. Szwed, D. Marques, and K. Pingali. Application-level Checkpointing for Shared Memory Programs. In Proc. ASPLOS, 2004. Google ScholarDigital Library
- K. M. Chandy and L. Lamport. Distributed Snapshots: Determining Global States of Distributed Systems. ACM TOCS, 3(1):63--75, 1985. Google ScholarDigital Library
- R. Cortez. World Class Networking Infrastructure. In Proc. Austin GDC, 2007.Google Scholar
- D. J. DeWitt, R. Katz, F. Olken, L. Shapiro, M. Stonebraker, and D. Wood. Implementation Techniques for Main Memory Database Systems. In Proc. SIGMOD, 1984. Google ScholarDigital Library
- M. Elnozahy, L. Alvisi, Y.-M. Wang, and D. B. Johnson. A Survey of Rollback-Recovery Protocols in Message-Passing Systems. ACM Computing Surveys, 34(3):375--408, 2002. Google ScholarDigital Library
- H. F. Gujónsson. The Server Technology of EVE Online: How to Cope With 300,000 Players on One Server. In Proc. Austin GDC, 2008.Google Scholar
- N. Gupta, A. J. Demers, J. Gehrke, P. Unterbrunner, and W. M. White. Scalability for virtual worlds. In ICDE, 2009. Google ScholarDigital Library
- R. Hagmann. A Crash Recovery Scheme for a Memory-Resident Database System. IEEE Transactions on Computers, 35(9):839--843, 1986. Google ScholarDigital Library
- M. Herlihy. Wait-free Synchronization. ACM TOPLAS, 13(1):124--149, 1991. Google ScholarDigital Library
- S.-O. Hvasshovd, O. Torbjornsen, S. Bratsberg, and P. Holager. The ClustRa Telecom Database: High Availability, High Throughput, and Real-Time Response. In Proc. VLDB, 1995. Google ScholarDigital Library
- Intel VTune Performance Analyzer. http://software.intel.com/en-us/intel-vtune.Google Scholar
- Jason Gregory. Game Engine Architecture (Section 7.5). A K Peters, 2009.Google Scholar
- E. P. C. Jones, D. J. Abadi, and S. Madden. Low overhead concurrency control for partitioned main memory databases. In Proc. SIGMOD, 2010. Google ScholarDigital Library
- E. Lau and S. Madden. An Integrated Approach to Recovery and High Availability in an Updatable, Distributed Data Warehouse. In Proc. VLDB, 2006. Google ScholarDigital Library
- M. Litzkow, T. Tannenbaum, J. Basney, and M. Livny. Checkpoint and Migration of Unix Processes in the Condor Distributed Processing System. Technical Report 1346, University of Winsconsin-Madison, 1997.Google Scholar
- T. MacDonald. Solid-state Storage Not Just a Flash in the Pan. Storage Magazine, 2007. http://searchStorage.techtarget.com/magazineFeature/0,2%96894,sid5_gci1276095,00.html http://searchStorage.techtarget.com/magazineFeature/0,296894,sid5_gci1276%095,00.html.Google Scholar
- C. Mohan, D. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. ACM TODS, 17(1):94--162, 1992. Google ScholarDigital Library
- J. S. Plank, M. Beck, G. Kingsley, and K. Li. Libckpt: Transparent Checkpointing under UNIX. In Proc. USENIX Winter Technical Conference, 1995. Google ScholarDigital Library
- C. Pu. On-the-Fly, Incremental, Consistent Reading of Entire Databases. Algorithmica, 1:271--287, 1986.Google ScholarCross Ref
- RamSan-400 Specifications. http://www.ramsan.com/products/ramsan-400.htm http://www.ramsan.com/products/ramsan-400.htm.Google Scholar
- D. Rosenkrantz. Dynamic Database Dumping. In Proc. SIGMOD, 1978. Google ScholarDigital Library
- K. Salem and H. Garcia-Molina. Checkpointing Memory-Resident Databases. In Proc. ICDE, 1989. Google ScholarDigital Library
- M. V. Salles, T. Cao, B. Sowell, A. Demers, J. Gehrke, C. Koch, and W. White. An evaluation of checkpoint recovery for massively multiplayer online games. In Proc. VLDB, 2009.Google ScholarDigital Library
- M. Stonebraker, S. Madden, D. Abadi, S. Harizopoulos, N. Hachem, and P. Helland. The End of an Architectural Era (It's Time for a Complete Rewrite). In Proc. VLDB, 2007. Google ScholarDigital Library
- T. Strohman and W. B. Croft. Efficient Document Retrieval in Main Memory. In Proc. SIGIR, 2007. Google ScholarDigital Library
- A. Thomson and D. Abadi. The case for determinism in database systems. In Proc. VLDB, 2010. Google ScholarDigital Library
- Transaction Processing Council. TPC Benchmark(TM) C, 2010. http://www.tpc.org/tpcc/spec/tpcc_current.pdf.Google Scholar
- P. Unterbrunner, G. Giannikis, G. Alonso, D. Fauser, and D. Kossmann. Predictable performance for unpredictable workloads. PVLDB, 2(1):706--717, 2009. Google ScholarDigital Library
- VoltDB. http://voltdb.com/product.Google Scholar
- G. Wang, M. V. Salles, B. Sowell, X. Wang, T. Cao, A. Demers, J. Gehrke, and W. White. Behavioral simulations in mapreduce. In Proc. VLDB, 2010. Google ScholarDigital Library
- W. White, A. Demers, C. Koch, J. Gehrke, and R. Rajagopalan. Scaling Games to Epic Proportions. In Proc. SIGMOD, 2007. Google ScholarDigital Library
- A. Whitney, D. Shasha, and S. Apter. High Volume Transaction Processing Without Concurrency Control, Two Phase Commit, SQL, or C+. In Proc. HPTS, 1997.Google Scholar
- G. Zheng, L. Shi, and L. V. Kale. FTC-Charm: an In-Memory Checkpoint-Based Fault Tolerant Runtime for Charm and MPI. In Proc. CLUSTER, 2004. Google ScholarDigital Library
Index Terms
- Fast checkpoint recovery algorithms for frequently consistent applications
Recommendations
BRRL: a recovery library for main-memory applications in the cloud
SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of dataIn this demonstration we present BRRL, a library for making distributed main-memory applications fault tolerant. BRRL is optimized for cloud applications with frequent points of consistency that use data-parallelism to avoid complex concurrency control ...
Fast Failure Recovery for Main-Memory DBMSs on Multicores
SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of DataMain-memory database management systems (DBMS) can achieve excellent performance when processing massive volume of on-line transactions on modern multi-core machines. But existing durability schemes, namely, tuple-level and transaction-level logging-and-...
An optimistic checkpointing and message logging approach for consistent global checkpoint collection in distributed systems
Checkpointing and rollback recovery are widely used techniques for achieving fault-tolerance in distributed systems. In this paper, we present a novel checkpointing algorithm which has the following desirable features: A process can independently ...
Comments