ABSTRACT
By maintaining the data in main memory, in-memory databases dramatically reduce the I/O cost of transaction processing. However, for recovery purposes, in-memory systems still need to flush the log to disk, which incurs a substantial number of I/Os. Recently, command logging has been proposed to replace the traditional data log (e.g., ARIES logging) in in-memory databases. Instead of recording how the tuples are updated, command logging only tracks the transactions that are being executed, thereby effectively reducing the size of the log and improving the performance. However, when a failure occurs, all the transactions in the log after the last checkpoint must be redone sequentially and this significantly increases the cost of recovery. In this paper, we first extend the command logging technique to a distributed system, where all the nodes can perform their recovery in parallel. We show that in a distributed system, the only bottleneck of recovery caused by command logging is the synchronization process that attempts to resolve the data dependency among the transactions. We then propose an adaptive logging approach by combining data logging and command logging. The percentage of data logging versus command logging becomes a tuning knob between the performance of transaction processing and recovery to meet different OLTP requirements, and a model is proposed to guide such tuning. Our experimental study compares the performance of our proposed adaptive logging, ARIES-style data logging and command logging on top of H-Store. The results show that adaptive logging can achieve a 10x boost for recovery and a transaction throughput that is comparable to that of command logging.
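The contrast between the two log types can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the paper's implementation or H-Store's API: the in-memory store is a plain dictionary, and `transfer`, `PROCEDURES`, `CommandRecord`, `DataRecord`, `run`, and `recover` are hypothetical names introduced here. The `use_data_log` flag stands in for whatever policy chooses between the two log types; the paper's tuning model is not reproduced.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Store = Dict[str, int]


def transfer(store: Store, src: str, dst: str, amount: int) -> Dict[str, int]:
    """Hypothetical stored procedure; returns the after-images it wrote."""
    store[src] -= amount
    store[dst] += amount
    return {src: store[src], dst: store[dst]}


# Registry of procedures that a command log record can name (assumption).
PROCEDURES: Dict[str, Callable[..., Dict[str, int]]] = {"transfer": transfer}


@dataclass
class CommandRecord:
    """Command logging: record only which transaction ran and its arguments."""
    proc: str
    args: Tuple


@dataclass
class DataRecord:
    """Data logging: record the new values of the tuples that were updated."""
    after_images: Dict[str, int]


LogRecord = Union[CommandRecord, DataRecord]


def run(store: Store, log: List[LogRecord], proc: str, args: Tuple,
        use_data_log: bool) -> None:
    """Execute a transaction and append one record of the chosen type."""
    written = PROCEDURES[proc](store, *args)
    if use_data_log:
        log.append(DataRecord(dict(written)))
    else:
        log.append(CommandRecord(proc, args))


def recover(checkpoint: Store, log: List[LogRecord]) -> Store:
    """Rebuild state from the last checkpoint by replaying the log in order."""
    store = dict(checkpoint)
    for rec in log:
        if isinstance(rec, DataRecord):
            # Installing after-images needs no re-execution.
            store.update(rec.after_images)
        else:
            # A command record must be re-executed against the current state.
            PROCEDURES[rec.proc](store, *rec.args)
    return store


checkpoint = {"a": 100, "b": 50, "c": 0}
live = dict(checkpoint)
log: List[LogRecord] = []
run(live, log, "transfer", ("a", "b", 30), use_data_log=False)
run(live, log, "transfer", ("b", "c", 20), use_data_log=True)
assert recover(checkpoint, log) == live
```

In this sketch a command record is smaller to write but has to be re-executed during recovery, while a data record carries the after-images and is simply installed. Mixing the two in one log is the trade-off the abstract describes as a tuning knob.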
Index Terms
- Adaptive Logging: Optimizing Logging and Recovery Costs in Distributed In-memory Databases