research-article

Index checkpoints for instant recovery in in-memory database systems

Authors:
Leon Lee

Lab of HuaweiCloud, Beijing, China

Lab of HuaweiCloud, Beijing, China
View Profile

,
Siphrey Xie

Lab of HuaweiCloud, Beijing, China

Lab of HuaweiCloud, Beijing, China
View Profile

,
Yunus Ma

Lab of HuaweiCloud, Beijing, China

Lab of HuaweiCloud, Beijing, China
View Profile

,
Shimin Chen

Chinese Academy of Sciences, Beijing, China

Chinese Academy of Sciences, Beijing, China
View Profile

Proceedings of the VLDB Endowment Volume 15 Issue 8pp 1671–1683https://doi.org/10.14778/3529337.3529350

Published:01 April 2022Publication History

Proceedings of the VLDB Endowment

Abstract

We observe that the time bottleneck during the recovery phase of an IMDB (In-Memory DataBase system) shifts from log replaying to index rebuilding after the state-of-art techniques for instant recovery have been applied. In this paper, we investigate index checkpoints to eliminate this bottleneck. However, improper designs may lead to inconsistent index checkpoints or incur severe performance degradation. For the correctness challenge, we combine two techniques, i.e., deferred deletion of index entries, and on-demand clean-up of dangling index entries after recovery, to achieve data correctness. For the efficiency challenge, we propose three wait-free index checkpoint algorithms, i.e., ChainIndex, MirrorIndex, IACoW, for supporting efficient normal processing and fast recovery. We implement our proposed solutions in HiEngine, an IMDB being developed as part of Huawei's next-generation cloud-native database product. We evaluate the impact of index checkpoint persistence on recovery and transaction performance using two workloads (i.e., TPC-C and Microbench). We analyze the pros and cons of each algorithm. Our experimental results show that HiEngine can be recovered instantly (i.e., in ~10 s) with only slight (i.e., 5% - 11%) performance degradation. Therefore, we strongly recommend integrating index checkpointing into IMDBs if recovery time is a crucial product metric.

References

Joy Arulraj, Matthew Perron, and Andrew Pavlo. 2016. Write-Behind Logging. Proc. VLDB Endow. 10 (2016), 337--348.Google ScholarDigital Library
Robert Binna, Eva Zangerle, M. Pichl, Günther Specht, and Viktor Leis. 2018. HOT: A Height Optimized Trie Index for Main-Memory Database Systems. In SIGMOD.Google ScholarDigital Library
Tuan Cao, M. V. Salles, B. Sowell, Yao Yue, A. Demers, J. Gehrke, and Walker M. White. 2011. Fast checkpoint recovery algorithms for frequently consistent applications. In SIGMOD.Google Scholar
D. DeWitt, R. Katz, F. Olken, Leonard D. Shapiro, M. Stonebraker, and D. Wood. 1984. Implementation techniques for main memory database systems. In SIGMOD.Google Scholar
J. Gehrke and Tuan Cao. 2013. Fault tolerance for main-memory applications in the cloud.Google Scholar
Goetz Graefe. 2012. A survey of B-tree logging and recovery techniques. ACM Transactions on Database Systems (TODS) 37 (2012), 1 -- 35.Google ScholarDigital Library
Theo Haerder and Andreas Reuter. 1983. Principles of transaction-oriented database recovery. ACM Comput. Surv. 15 (1983), 287--317.Google ScholarDigital Library
Michael Haubenschild, Caetano Sauer, Thomas Neumann, and Viktor Leis. 2020. Rethinking Logging, Checkpoints, and Recovery for High-Performance Storage Engines. In SIGMOD.Google Scholar
Hironobu. [n.d.]. The internals of postgresql for database administrators and system developers. http://www.interdb.jp/pg/Google Scholar
A. Kemper and Thomas Neumann. 2011. HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In ICDE. 195--206.Google Scholar
Kangnyeon Kim, Tianzheng Wang, R. Johnson, and I. Pandis. 2016. ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads. In SIGMOD.Google ScholarDigital Library
P. Larson, Spyros Blanas, C. Diaconu, C. Freedman, J. M. Patel, and Mike Zwilling. 2011. High-Performance Concurrency Control Mechanisms for Main-Memory Databases. Proc. VLDB Endow. 5 (2011), 298--309.Google ScholarDigital Library
Viktor Leis, Michael Haubenschild, A. Kemper, and Thomas Neumann. 2018. LeanStore: In-Memory Data Management beyond Main Memory. ICDE, 185--196.Google Scholar
Viktor Leis, Michael Haubenschild, and T. Neumann. 2019. Optimistic Lock Coupling: A Scalable and Efficient General-Purpose Synchronization Method. IEEE TKDE 42 (2019), 73--84.Google Scholar
Viktor Leis, A. Kemper, and Thomas Neumann. 2013. The adaptive radix tree: ARTful indexing for main-memory databases. ICDE, 38--49.Google Scholar
Viktor Leis, Florian Scheibner, A. Kemper, and T. Neumann. 2016. The ART of practical synchronization. In DaMoN '16.Google Scholar
Justin J. Levandoski, D. Lomet, and S. Sengupta. 2013. The Bw-Tree: A B-tree for new hardware platforms. ICDE, 302--313.Google Scholar
Justin J. Levandoski, D. Lomet, and S. Sengupta. 2013. LLAMA: A Cache/Storage Subsystem for Modern Hardware. Proc. VLDB Endow. 6 (2013), 877--888.Google ScholarDigital Library
L. Li, Guoren Wang, Gang Wu, and Ye Yuan. 2018. Consistent Snapshot Algorithms for In-Memory Database Systems: Experiments and Analysis. ICDE, 1284--1287.Google Scholar
L. Li, Guoren Wang, Gang Wu, Ye Yuan, Lei Chen, and Xiang Lian. 2021. A Comparative Study of Consistent Snapshot Algorithms for Main-Memory Database Systems. IEEE TKDE 33 (2021), 316--330.Google Scholar
Antti-Pekka Liedes and A. Wolski. 2006. SIREN: A Memory-Conserving, Snapshot-Consistent Checkpoint Algorithm for in-Memory Databases. ICDE, 99--99.Google Scholar
Hyeontaek Lim, M. Kaminsky, and D. Andersen. 2017. Cicada: Dependably Fast Multi-Core In-Memory Transactions. In SIGMOD.Google ScholarDigital Library
Gangcai Liu, Leying Chen, and Shimin Chen. 2021. Zen: a High-Throughput Log-Free OLTP Engine for Non-Volatile Main Memory. Proc. VLDB Endow. 14 (2021), 835--848.Google ScholarDigital Library
David B. Lomet and Betty Salzberg. 1997. Concurrency and recovery for index trees. The VLDB Journal 6 (1997), 224--240.Google ScholarDigital Library
Yunus Ma, Siphrey Xie, Henry Zhong, Leon Lee, and King Lv. 2022. HiEngine: How to Architect a Cloud-Native Memory-Optimized Database Engine.. In SIGMOD.Google Scholar
Arlino Magalhães, Angelo Brayner, José Maria S. Monteiro, and Gustavo Moraes. 2021. Indexed Log File: Towards Main Memory Database Instant Recovery. In EDBT.Google Scholar
Nirmesh Malviya, Ariel Weisberg, S. Madden, and M. Stonebraker. 2014. Rethinking main memory OLTP recovery. ICDE, 604--615.Google Scholar
Yandong Mao, E. Kohler, and R. Morris. 2012. Cache craftiness for fast multicore key-value storage. In EuroSys '12.Google Scholar
C. Mohan, Donald J. Haderle, B. Lindsay, H. Pirahesh, and P. Schwarz. 1992. ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. ACM Trans. Database Syst. 17 (1992), 94--162.Google ScholarDigital Library
Thomas Neumann, Tobias Mühlbauer, and A. Kemper. 2015. Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems. In SIGMOD.Google Scholar
Andrew Pavlo, Gustavo Angulo, Joy Arulraj, Haibin Lin, Jiexi Lin, Lin Ma, Prashanth Menon, T. Mowry, Matthew Perron, Ian Quah, Siddharth Santurkar, A. Tomasic, S. Toor, D. V. Aken, Ziqi Wang, Yingjun Wu, Ran Xian, and Tieying Zhang. 2017. Self-Driving Database Management Systems. In CIDR.Google Scholar
PMDK. [n.d.]. using-persistent-memory-devices-with-the-linux-device-mapper. https://pmem.io/2018/05/15/using_persistent_memory_devices_with_the_linux_device_mapper.htmlGoogle Scholar
J. Rao and K. A. Ross. 2000. Making B+- trees cache conscious in main memory. In SIGMOD.Google Scholar
Kun Ren, Thaddeus Diamond, D. Abadi, and Alexander Thomson. 2016. Low-Overhead Asynchronous Checkpointing in Main-Memory Database Systems. In SIGMOD.Google Scholar
K. Salem and H. Garcia-Molina. 1990. System M: A Transaction Processing Testbed for Memory Resident Data. IEEE TKDE 2 (1990), 161--172.Google Scholar
M. V. Salles, Tuan Cao, B. Sowell, A. Demers, J. Gehrke, Christoph E. Koch, and Walker M. White. 2009. An Evaluation of Checkpoint Recovery for Massively Multiplayer Online Games. Proc. VLDB Endow. 2 (2009), 1258--1269.Google ScholarDigital Library
Caetano Sauer. 2019. Modern techniques for transaction-oriented database recovery1.Google Scholar
Caetano Sauer, G. Graefe, and T. Härder. 2017. Instant Restore After a Media Failure. In ADBIS.Google Scholar
Caetano Sauer, G. Graefe, and T. Härder. 2018. FineLine: log-structured transactional storage and recovery. Proc. VLDB Endow. 11 (2018), 2249--2262.Google ScholarDigital Library
Stephen Tu, Wenting Zheng, E. Kohler, B. Liskov, and S. Madden. 2013. Speedy transactions in multicore in-memory databases. Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles.Google Scholar
Alexandre Verbitski, Anurag Gupta, D. Saha, Murali Brahmadesam, K. Gupta, Raman Mittal, S. Krishnamurthy, Sandor Maurice, T. Kharatishvili, and Xiaofeng Bao. 2017. Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. In SIGMOD.Google ScholarDigital Library
Ziqi Wang, Andrew Pavlo, Hyeontaek Lim, Viktor Leis, Huanchen Zhang, M. Kaminsky, and D. Andersen. 2018. Building a Bw-Tree Takes More Than Just Buzz Words. In SIGMOD.Google Scholar
Yingjun Wu, Joy Arulraj, Jiexi Lin, R. Xian, and A. Pavlo. 2017. An Empirical Evaluation of In-Memory Multi-Version Concurrency Control. Proc. VLDB Endow. 10 (2017), 781--792.Google ScholarDigital Library
Yingjun Wu, Wentian Guo, Chee-Yong Chan, and Kian-Lee Tan. 2017. Fast Failure Recovery for Main-Memory DBMSs on Multicores. In SIGMOD (Chicago, Illinois, USA) (SIGMOD '17). Association for Computing Machinery, New York, NY, USA, 267--281. Google ScholarDigital Library
Yu Xia, Xiangyao Yu, A. Pavlo, and S. Devadas. 2020. Taurus: Lightweight Parallel Logging for In-Memory Database Management Systems (Extended Version). ArXiv abs/2010.06760 (2020).Google Scholar
Chang Yao, D. Agrawal, G. Chen, B. Ooi, and Sai Wu. 2016. Adaptive Logging: Optimizing Logging and Recovery Costs in Distributed In-memory Databases. In SIGMOD.Google ScholarDigital Library
Hao Zhang, Gang Chen, B. Ooi, K. Tan, and Meihui Zhang. 2015. In-Memory Big Data Management and Processing: A Survey. IEEE TKDE 27 (2015), 1920--1948.Google ScholarDigital Library
L. Zhang, Matthew Butrovich, Tianyu Li, Yash Nannapanei, A. Pavlo, John Rollinson, Huanchen Zhang, Ambarish Balakumar, Daniel Biales, Ziqi Dong, Emmanuel Eppinger, J. González, Wan Shen Lim, Jianqiao Liu, Prashanth Menon, Soumil Mukherjee, Tanuj Nayak, Amadou Ngom, Jeff Niu, Deepayan Patra, Poojita Raj, Stephanie Wang, Wuwen Wang, Yao Yu, and W. Zhang. 2020. Everything is a Transaction: Unifying Logical Concurrency Control and Physical Data Structure Maintenance in Database Management Systems.Google Scholar
Wenting Zheng, Stephen Tu, E. Kohler, and B. Liskov. 2014. Fast Databases with Fast Durability and Recovery Through Multicore Parallelism. In OSDI.Google Scholar

Recommendations

Main Memory Database Recovery: A Survey

Many of today’s applications need massive real-time data processing. In-memory database systems have become a good alternative for these requirements. These systems maintain the primary copy of the database in the main memory to achieve high throughput ...
Read More
Checkpointing and recovery in distributed and database systems
Read More
Implementing rollback-recovery coordinated checkpoints
ISSADS'05: Proceedings of the 5th international conference on Advanced Distributed Systems

Recovering from processor failures in distributed systems is an important problem in the design of reliable systems. The processes should coordinate their operation to guarantee that the set of local checkpoints taken by the individual processes form a ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
Proceedings of the VLDB Endowment Volume 15, Issue 8
April 2022
220 pages
ISSN:2150-8097
Editors:
Fatma Özcan
Google
,
Juliana Freire
New York University
,
Xuemin Lin
University of New South Wales
Issue’s Table of Contents
Sponsors
In-Cooperation
Publisher
VLDB Endowment
Publication History
- Published: 1 April 2022
Published in pvldb Volume 15, Issue 8
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 4
  Total Citations
  View Citations
- 197
  Total Downloads
- Downloads (Last 12 months)62
- Downloads (Last 6 weeks)8
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Index checkpoints for instant recovery in in-memory database systems

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Recommendations

Main Memory Database Recovery: A Survey

Checkpointing and recovery in distributed and database systems

Implementing rollback-recovery coordinated checkpoints

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Index checkpoints for instant recovery in in-memory database systems

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Recommendations

Main Memory Database Recovery: A Survey

Checkpointing and recovery in distributed and database systems

Implementing rollback-recovery coordinated checkpoints

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media