ABSTRACT
As widely used indices in key-value stores, the Log-Structured Merge-tree (LSM-tree) and its variants suffer from severe write amplification due to frequent merges in compactions for write-intensive applications. To address the problem, we first propose the Log-Structured Append-tree (LSA-tree), which tries to compact data with appends instead of merges, significantly reduces the write amplification and solves the issues existed in current append trees. However LSA increases read and space amplifications. Furthermore based on LSA, we design the Integrated Append/Merge-tree (IAM-tree). IAM selects appends or merges in compaction operations according to the size of memory-cached data. Theoretical analysis shows that IAM reduces the write amplification of LSM while keep the same read and space amplification.
We implement IAM as a user library named IamDB. Experiments show that its write amplification is much less than that of LSM, only 8.71 vs. 19.00 for 1TB data with 64GB memory. Compared with nicely tuned LevelDB and RocksDB, IamDB provides 1.4-2.7× and 1.6-1.9× better write throughput, saves 12% and 10% disk space respectively, as well as the comparable read and scan performance. At the meantime IamDB achieves the most stable tail latency.
- Oana Balmau, Rachid Guerraoui, Vasileios Trigonakis, and Igor Zablotchi. 2017. FloDB: Unlocking memory in persistent key-value stores. In Proceedings of EuroSys'17. 80--94. Google ScholarDigital Library
- Burton H Bloom. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13, 7 (1970), 422--426. Google ScholarDigital Library
- Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C Hsieh, Deborah A Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E Gruber. 2008. Bigtable: A distributed storage system for structured data. TOCS 26, 2 (2008), 4. Google ScholarDigital Library
- Niv Dayan and Stratos Idreos. 2018. Dostoevsky: Better Space-Time Trade-Offs for LSM-Tree Based Key-Value Stores via Adaptive Removal of Superfluous Merging. In Proceedings of SIGMOD'18. 505--520. Google ScholarDigital Library
- Facebook. 2017. Write stalls. https://github.com/facebook/rocksdb/wiki/writestalls.Google Scholar
- Facebook. 2019. MyRocks. http://myrocks.io/.Google Scholar
- Facebook. 2019. RocksDB. https://rocksdb.org/.Google Scholar
- Google. 2019. LevelDB. http://leveldb.org/.Google Scholar
- Sudarsun Kannan, Nitish Bhat, Ada Gavrilovska, Andrea ArpaciDusseau, and Remzi Arpaci-Dusseau. 2018. Redesigning LSMs for nonvolatile memory with NoveLSM. In Proceedings of ATC'18. 993--1005. Google ScholarDigital Library
- Avinash Lakshman and Prashant Malik. 2010. Cassandra: A decentralized structured storage system. ACM SIGOPS Operating Systems Review 44, 2(2010), 35--40. Google ScholarDigital Library
- Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C Arpacidusseau, and Remzi H Arpacidusseau. 2016. WiscKey: Separating keys from values in SSD-conscious storage. In Proceedings of FAST'16. 133--148. Google ScholarDigital Library
- Chen Luo and Michael J. Carey. 2018. LSM-based Storage Techniques: A Survey. https://arxiv.org/abs/1812.07527.Google Scholar
- Patrick O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O'Neil. 1996. The log-structured merge-tree (LSM-tree). Acta Informatica 33, 4 (1996), 351--385. Google ScholarDigital Library
- Raju Pandian, Kadekodi Rohan, Chidambaram Vijay, and Abraham Ittai. 2017. PebblesDB: Building key-value stores using fragmented log-structured merge trees. In Proceedings of SOSP'17. 497--514. Google ScholarDigital Library
- Kai Ren and Garth A Gibson. 2013. TABLEFS: enhancing metadata efficiency in the local file system. In Proceedings of ATC'13. 145--156. Google ScholarDigital Library
- Kai Ren, Qing Zheng, Joy Arulraj, and Garth Gibson. 2017. SlimDB: A space-efficient key-value storage engine for semi-sorted data. In Proceedings of the VLDB Endowment, Vol. 10. 2037--2048. Google ScholarDigital Library
- Russell Sears and Raghu Ramakrishnan. 2012. bLSM: a general purpose log structured merge tree. In Proceedings of SIGMOD'12. ACM, 217--228. Google ScholarDigital Library
- Hoang Tam Vo, Sheng Wang, Divyakant Agrawal, Gang Chen, and Beng Chin Ooi. 2012. LogBase: a scalable log-structured database system in the cloud. Proceedings of the VLDB Endowment 5 (2012), 1004--1015. Google ScholarDigital Library
- Wu Xingbo, Xu Yuehai, Shao Zili, and Jiang Song. 2015. LSM-trie: An LSM-tree-based ultra-large key-value store for small data. In Proceedings of ATC'15. 71--82. Google ScholarDigital Library
- Ting Yao, Jiguang Wan, Ping Huang, Xubin He, Qingxin Gui, Fei Wu, and Changsheng Xie. 2017. A light-weight compaction tree to reduce I/O amplification toward efficient key-value stores. In Proceedings of MSST'17.Google Scholar
- Huanchen Zhang, Hyeontaek Lim, Viktor Leis, David G Andersen, Michael Kaminsky, Kimberly Keeton, and Andrew Pavlo. 2018. SuRF: Practical range query filtering with fast succinct tries. In Proceedings of SIGMOD'18. 323--336. Google ScholarDigital Library
Recommendations
Splaying Log-Structured Merge-Trees
SIGMOD '18: Proceedings of the 2018 International Conference on Management of DataModern persistent key-value stores typically use a log-structured merge-tree (LSM-tree) design, which allows for high write throughput. Our observation is that the LSM-tree, however, has suboptimal performance during read-intensive workload windows with ...
LSM-tree managed storage for large-scale key-value store
SoCC '17: Proceedings of the 2017 Symposium on Cloud ComputingKey-value stores are increasingly adopting LSM-trees as their enabling data structure in the backend storage, and persisting their clustered data through a file system. A file system is expected to not only provide file/directory abstraction to organize ...
Building Efficient Key-Value Stores via a Lightweight Compaction Tree
Special Issue on MSST 2017 and Regular PapersLog-Structure Merge tree (LSM-tree) has been one of the mainstream indexes in key-value systems supporting a variety of write-intensive Internet applications in today’s data centers. However, the performance of LSM-tree is seriously hampered by ...
Comments