Research Article · Open Access
DOI: 10.1145/3465998.3466004

KallaxDB: A Table-less Hash-based Key-Value Store on Storage Hardware with Built-in Transparent Compression

Published: 20 June 2021

ABSTRACT

This paper studies the design of a key-value (KV) store that can take full advantage of modern storage hardware with built-in transparent compression. Many modern storage appliances and drives implement hardware-based data compression that is transparent to the OS and applications. Moreover, the growing deployment of hardware-based compression in Cloud infrastructure points to the imminent arrival of Cloud-based storage hardware with built-in transparent compression. By decoupling logical storage space utilization from true physical storage usage, transparent compression allows data management software to purposely waste logical storage space in return for simpler data structures and algorithms, leading to lower implementation complexity and higher performance. This work proposes a table-less hash-based KV store, whose basic idea is to hash the key space directly onto the logical storage space without using a hash table at all. With a substantially simplified data structure, this approach suffers significant logical storage space under-utilization, which can be seamlessly mitigated by storage hardware with transparent compression. This paper presents the basic KV store architecture and develops mathematical formulations to assist its configuration and analysis. We implemented such a KV store, KallaxDB, and carried out experiments on a commercial SSD with built-in transparent compression. The results show that, while consuming very little memory, KallaxDB compares favorably with other modern KV stores in terms of throughput, latency, and CPU usage.
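To make the table-less idea concrete, the sketch below (in Python) illustrates one way such a scheme could work; it is not KallaxDB's actual implementation. Keys are hashed directly to fixed-size logical pages in a flat file, with no in-memory hash table; each record is zero-padded to a full page, so on a drive with built-in transparent compression the unused logical space costs very little physical capacity. The page size, probe count, on-page record layout, and class name are illustrative assumptions, not details from the paper.

# Minimal sketch of hashing the key space directly onto logical storage.
# Assumptions (not from the paper): 4 KB pages, 1M-page logical space,
# linear probing over a few pages, and a simple [klen|vlen|key|value] layout.
import hashlib
import struct

PAGE_SIZE = 4096            # assumed logical page size
NUM_PAGES = 1 << 20         # assumed logical space: 1M pages (4 GiB logical)
MAX_PROBES = 4              # probe a few consecutive pages on collision

class HashOnStorageKV:
    def __init__(self, path: str):
        self.f = open(path, "w+b")
        # Allocate the full logical space up front; the zero-filled regions are
        # sparse on the filesystem and compress to almost nothing on the drive.
        self.f.truncate(PAGE_SIZE * NUM_PAGES)

    def _page_of(self, key: bytes, probe: int) -> int:
        h = int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")
        return (h + probe) % NUM_PAGES

    def _read_page(self, page: int) -> bytes:
        self.f.seek(page * PAGE_SIZE)
        return self.f.read(PAGE_SIZE)

    def put(self, key: bytes, value: bytes) -> None:
        record = struct.pack("<II", len(key), len(value)) + key + value
        assert len(record) <= PAGE_SIZE, "record must fit in one page in this sketch"
        for probe in range(MAX_PROBES):
            page = self._page_of(key, probe)
            data = self._read_page(page)
            klen, _ = struct.unpack_from("<II", data)
            if klen == 0 or data[8:8 + klen] == key:   # empty page or overwrite
                self.f.seek(page * PAGE_SIZE)
                # Zero padding wastes logical space but compresses away physically.
                self.f.write(record.ljust(PAGE_SIZE, b"\0"))
                return
        raise RuntimeError("all probe pages occupied; a real design must handle overflow")

    def get(self, key: bytes):
        for probe in range(MAX_PROBES):
            data = self._read_page(self._page_of(key, probe))
            klen, vlen = struct.unpack_from("<II", data)
            if klen == 0:
                return None                            # hit an empty page: key absent
            if data[8:8 + klen] == key:
                return data[8 + klen:8 + klen + vlen]
        return None

if __name__ == "__main__":
    kv = HashOnStorageKV("/tmp/kallax_sketch.dat")
    kv.put(b"user:42", b"alice")
    print(kv.get(b"user:42"))   # b'alice'

Note that the sketch deliberately over-provisions logical pages so that most pages stay empty or half-empty; without transparent compression this would be unacceptable space waste, which is exactly the trade-off the paper's storage hardware makes viable.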


  • Published in

    DAMON '21: Proceedings of the 17th International Workshop on Data Management on New Hardware
    June 2021
    104 pages
    ISBN:9781450385565
    DOI:10.1145/3465998

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 20 June 2021

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

DAMON '21 Paper Acceptance Rate: 15 of 17 submissions, 88%. Overall Acceptance Rate: 80 of 102 submissions, 78%.
