Adaptive Logging: Optimizing Logging and Recovery Costs in Distributed In-memory Databases

Authors:
Chang Yao (National University of Singapore, Singapore, Singapore)
Divyakant Agrawal (University of California at Santa Barbara, Santa Barbara, CA, USA)
Gang Chen (Zhejiang University, Hangzhou, China)
Beng Chin Ooi (National University of Singapore, Singapore, Singapore)
Sai Wu (Zhejiang University, Hangzhou, China)

SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data, June 2016, Pages 1119–1134
https://doi.org/10.1145/2882903.2915208

Published: 14 June 2016

ABSTRACT

By maintaining the data in main memory, in-memory databases dramatically reduce the I/O cost of transaction processing. However, for recovery purposes, in-memory systems still need to flush the log to disk, which incurs a substantial number of I/Os. Recently, command logging has been proposed to replace the traditional data log (e.g., ARIES logging) in in-memory databases. Instead of recording how the tuples are updated, command logging only tracks the transactions that are being executed, thereby effectively reducing the size of the log and improving the performance. However, when a failure occurs, all the transactions in the log after the last checkpoint must be redone sequentially and this significantly increases the cost of recovery. In this paper, we first extend the command logging technique to a distributed system, where all the nodes can perform their recovery in parallel. We show that in a distributed system, the only bottleneck of recovery caused by command logging is the synchronization process that attempts to resolve the data dependency among the transactions. We then propose an adaptive logging approach by combining data logging and command logging. The percentage of data logging versus command logging becomes a tuning knob between the performance of transaction processing and recovery to meet different OLTP requirements, and a model is proposed to guide such tuning. Our experimental study compares the performance of our proposed adaptive logging, ARIES-style data logging and command logging on top of H-Store. The results show that adaptive logging can achieve a 10x boost for recovery and a transaction throughput that is comparable to that of command logging.
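
The difference between the two log types described in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation or H-Store's API: the `transfer` procedure, the record classes, and the per-transaction `use_data_log` flag are all hypothetical names introduced for illustration, and the paper's model for choosing between the two log types is not reproduced here. It only shows that a command record stores which transaction ran, a data record stores the resulting tuple values, and recovery can replay a log that mixes both.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Store = Dict[str, int]


# Hypothetical stored procedure: move `amount` from one account to another.
def transfer(store: Store, src: str, dst: str, amount: int) -> None:
    store[src] -= amount
    store[dst] += amount


# Registry of deterministic procedures that command records may refer to.
PROCEDURES: Dict[str, Callable[..., None]] = {"transfer": transfer}


@dataclass(frozen=True)
class CommandRecord:
    """Command logging: record which transaction ran (name and arguments)."""
    proc: str
    args: Tuple


@dataclass(frozen=True)
class DataRecord:
    """Data logging: record the new value of every key the transaction wrote."""
    writes: Tuple[Tuple[str, int], ...]


LogRecord = Union[CommandRecord, DataRecord]


def execute(store: Store, log: List[LogRecord], proc: str, args: Tuple,
            use_data_log: bool) -> None:
    """Run one transaction and append either a data or a command record."""
    before = dict(store)  # full copy: acceptable for a toy, not for a real system
    PROCEDURES[proc](store, *args)
    if use_data_log:
        writes = tuple((k, v) for k, v in store.items() if before.get(k) != v)
        log.append(DataRecord(writes))
    else:
        log.append(CommandRecord(proc, args))


def recover(checkpoint: Store, log: List[LogRecord]) -> Store:
    """Rebuild state from a checkpoint by replaying the log in order."""
    store = dict(checkpoint)
    for rec in log:
        if isinstance(rec, CommandRecord):
            # Re-execute the transaction; cost depends on the procedure.
            PROCEDURES[rec.proc](store, *rec.args)
        else:
            # Install the logged values directly; no re-execution needed.
            for key, value in rec.writes:
                store[key] = value
    return store
```

In this sketch a command record is small but must be re-executed during recovery, while a data record is larger but is applied directly, which is the trade-off the abstract describes as a tuning knob.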


Index Terms

1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database transaction processing
        Transaction logging

Published in
SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
June 2016
2300 pages
ISBN: 9781450335317
DOI: 10.1145/2882903
General Chairs: Fatma Özcan (IBM Research, USA), Georgia Koutrika (HP Labs, USA)
Program Chair: Sam Madden (Massachusetts Institute of Technology, USA)
Copyright © 2016 Owner/Author
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
OLTP
ARIES logging
command logging
distributed in-memory database
transaction logging
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 785 of 4,003 submissions, 20%