ABSTRACT
This paper studies the design of a key-value (KV) store that can take full advantage of modern storage hardware with built-in transparent compression. Many modern storage appliances/drives implement hardware-based data compression that is transparent to the OS and applications, and the growing deployment of hardware-based compression in Cloud infrastructure points to the imminent arrival of Cloud storage hardware with built-in transparent compression. By decoupling logical storage space utilization from true physical storage usage, transparent compression allows data management software to purposely waste logical storage space in return for simpler data structures and algorithms, leading to lower implementation complexity and higher performance. This work proposes a table-less hash-based KV store, whose basic idea is to hash the key space directly onto the logical storage space without using a hash table at all. With a substantially simplified data structure, this approach is subject to significant logical storage space under-utilization, which can be seamlessly mitigated by storage hardware with transparent compression. This paper presents the basic KV store architecture and develops mathematical formulations to assist its configuration and analysis. We implemented such a KV store, KallaxDB, and carried out experiments on a commercial SSD with built-in transparent compression. The results show that, while consuming very little memory, it compares favorably with other modern KV stores in terms of throughput, latency, and CPU usage.
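The table-less idea described above can be illustrated with a minimal sketch. This is not KallaxDB's actual implementation: the page size, record layout, hash function, and (elided) collision handling are all illustrative assumptions. Each key is hashed straight to a logical page; the unused tail of every page is kept as zeros, which is exactly the "wasted" logical space that the transparently compressing drive would reclaim physically.

```python
# Illustrative sketch of a table-less hash-based KV store: keys map directly
# to logical pages, so no in-memory hash table is maintained. Assumptions
# (not from the paper): 4 KB pages, a [klen|vlen|key|value] record layout,
# BLAKE2b as the hash, and no collision resolution (a colliding put simply
# overwrites the page).
import hashlib

PAGE_SIZE = 4096   # assumed logical page size
NUM_PAGES = 1024   # deliberately over-provisioned logical space

def page_index(key: bytes) -> int:
    """Map a key directly to a logical page number -- no hash table."""
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % NUM_PAGES

class TablelessStore:
    def __init__(self):
        # Stand-in for the drive's logical address space. On hardware with
        # transparent compression, the all-zero page tails would compress
        # to almost nothing, so logical under-utilization costs little.
        self.pages = [bytearray(PAGE_SIZE) for _ in range(NUM_PAGES)]

    def put(self, key: bytes, value: bytes) -> None:
        page = self.pages[page_index(key)]
        record = (len(key).to_bytes(2, "big")
                  + len(value).to_bytes(2, "big")
                  + key + value)
        page[: len(record)] = record
        # Zero the rest of the page: wasted logical space, reclaimed
        # physically by the drive's built-in compression.
        page[len(record):] = bytes(PAGE_SIZE - len(record))

    def get(self, key: bytes):
        page = self.pages[page_index(key)]
        klen = int.from_bytes(page[0:2], "big")
        vlen = int.from_bytes(page[2:4], "big")
        if bytes(page[4:4 + klen]) != key:
            return None  # empty page or a different key stored here
        return bytes(page[4 + klen:4 + klen + vlen])
```

Because the page index is a pure function of the key, a lookup costs one hash computation and one page read, with no index structure to maintain in memory; the trade-off is the logical space left unused in each page, which transparent compression absorbs.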