skip to main content
research-article
Public Access

WiscKey: Separating Keys from Values in SSD-Conscious Storage

Published:02 March 2017Publication History
Skip Abstract Section

Abstract

We present WiscKey, a persistent LSM-tree-based key-value store with a performance-oriented data layout that separates keys from values to minimize I/O amplification. The design of WiscKey is highly SSD optimized, leveraging both the sequential and random performance characteristics of the device. We demonstrate the advantages of WiscKey with both microbenchmarks and YCSB workloads. Microbenchmark results show that WiscKey is 2.5 × to 111 × faster than LevelDB for loading a database (with significantly better tail latencies) and 1.6 × to 14 × faster for random lookups. WiscKey is faster than both LevelDB and RocksDB in all six YCSB workloads.

References

  1. Jung-Sang Ahn, Chiyoung Seo, Ravi Mayuram, Rahim Yaseen, Jin-Soo Kim, and Seungryoul Maeng. 2016. ForestDB: A fast key-value storage system for variable-length string keys. IEEE Trans. Comput. 65, 3 (March 2016), 902--915. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Ramnatthan Alagappan, Aishwarya Ganesan, Yuvraj Patel, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Correlated crash vulnerabilities. In Proceedings of the 12th Symposium on Operating Systems Design and Implementation (OSDI’16). Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Ashok Anand, Chitra Muthukrishnan, Steven Kappes, Aditya Akella, and Suman Nath. 2010. Cheap and large cams for high-performance data-intensive networked systems. In Proceedings of the 7th Symposium on Networked Systems Design and Implementation (NSDI’10). Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, and Vijay Vasudevan. 2009. FAWN: A fast array of wimpy nodes. In Proceedings of the 22nd ACM Symposium on Operating Systems Principles (SOSP’09). Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Timothy G. Armstrong, Vamsi Ponnekanti, Dhruba Borthakur, and Mark Callaghan. 2013. LinkBench: A database benchmark based on the Facebook social graph. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD’13). Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau. 2014. Operating Systems: Three Easy Pieces (0.9 ed.). Arpaci-Dusseau Books.Google ScholarGoogle Scholar
  7. Manos Athanassoulis, Michael S. Kester, Lukas M. Maas, Radu Stoica, Stratos Idreos, Anastasia Ailamaki, and Mark Callaghan. 2015. Designing Access Methods: The RUM Conjecture. In The 19th International Conference on Extending Database Technology (EDBT’15).Google ScholarGoogle Scholar
  8. Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. 2015. Workload analysis of a large-scale key-value store. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC’15).Google ScholarGoogle Scholar
  9. Radheshyam Balasundaram, Igor Canadi, Yueh-Hsuan Chiang, Siying Dong, Leonidas Galanis, Lei Jin, Xing Jin, Venkatesh Radhakrishnan, Dmitri Smirnov, Yi Wu, and Feng Zhu. 2015. RocksDB Blog. Retrieved from http://rocksdb.org/blog/.Google ScholarGoogle Scholar
  10. Doug Beaver, Sanjeev Kumar, Harry C. Li, Jason Sobel, and Peter Vajgel. 2010. Finding a needle in haystack: Facebook’s photo storage. In Proceedings of the 9th Symposium on Operating Systems Design and Implementation (OSDI’10). Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Michael A. Bender, Martin Farach-Colton, Jeremy T. Fineman, Yonatan Fogel, Bradley Kuszmaul, and Jelani Nelson. 2007. Cache-oblivious streaming b-trees. In Proceedings of the 19th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA’07). Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Adam L. Buchsbaum, Michael Goldwasser, Suresh Venkatasubramanian, and Jeffery R. Westbrook. 2000. On external memory graph traversal. In Proceedings of the 11th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’00). Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Adrian M. Caulfield, Arup De, Joel Coburn, Todor I. Mollow, Rajesh K. Gupta, and Steven Swanson. 2010. Moneta: A high-performance storage array architecture for next-generation, non-volatile memories. In Proceedings of the 43nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’10). Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, and Robert Gruber. 2006. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI’06). 205--218. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Feng Chen, Rubao Lee, and Xiaodong Zhang. 2011. Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing. In Proceedings of the 17th International Symposium on High Performance Computer Architecture (HPCA’11). Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Jianjun Chen, Chris Douglas, Michi Mutsuzaki, Patrick Quaid, Raghu Ramakrishnan, Sriram Rao, and Russell Sears. 2012. Walnut: A unified cloud object store. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. John Colgrove, John D. Davis, John Hayes, Ethan L. Miller, Cary Sandvig, Russell Sears, Ari Tamches, Neil Vachharajani, and Feng Wang. 2015. Purity: Building fast, highly-available enterprise flash storage from commodity components. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD’15). Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, and Ramana Yerneni. 2008. PNUTS: Yahoo!s hosted data serving platform. In Proceedings of the VLDB Endowment (PVLDB’08). Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking cloud serving systems with YCSB. In Proceedings of the ACM Symposium on Cloud Computing (SOCC’10). Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Biplob Debnath, Sudipta Sengupta, and Jin Li. 2010. FlashStore: High throughput persistent key-value store. In Proceedings of the 36th International Conference on Very Large Databases (VLDB’10).Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Biplob Debnath, Sudipta Sengupta, and Jin Li. 2011. SkimpyStash: RAM space skimpy key-value store on flash-based storage. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data (SIGMOD’11). Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Guiseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swami Sivasubramanian, Peter Vosshall, and Werner Vogels. 2007. Dynamo: Amazon’s highly available key-value store. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP’07). Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Siying Dong. 2015. RocksDB: Challenges of LSM-Trees in Practice. http://www.hpts.ws/papers/2015/rocksdb_hpts_website.pdf.Google ScholarGoogle Scholar
  24. Facebook. 2015. RocksDB 2015 H2 Roadmap. http://rocksdb.org/blog/2015/rocksdb-2015-h2-roadmap/.Google ScholarGoogle Scholar
  25. Bin Fan, David G. Andersen, and Michael Kaminsky. 2013. MemC3: Compact and concurrent MemCache with dumber caching and smarter hashing. In Proceedings of the 10th Symposium on Networked Systems Design and Implementation (NSDI’13). Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Fusion-I. O. 2015. Fusion-I. O. ioDrive2. Retrieved from http://www.fusionio.com/products/iodrive2.Google ScholarGoogle Scholar
  27. Lars George. 2011. HBase: The Definitive Guide. O’Reilly.Google ScholarGoogle Scholar
  28. Sanjay Ghemawat and jeff Dean. 2011. LevelDB. Retrieved from http://code.google.com/p/leveldb.Google ScholarGoogle Scholar
  29. Guy Golan-Gueta, Edward Bortnikov, Eshcar Hillel, and Idit Keidar. 2015. Scaling concurrent log-structured data stores. In Proceedings of the EuroSys Conference (EuroSys’15). Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Tyler Harter, Dhruba Borthakur, Siying Dong, Amitanand Aiyer, Liyin Tang, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2014. Analysis of HDFS under HBase: A Facebook messages case study. In Proceedings of the 12th USENIX Symposium on File and Storage Technologies (FAST’14). Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, John Esmet, Yizheng Jiao, Ankur Mittal, Prashant Pandey, Phaneendra Reddy, Leif Walsh, Michael Bender, Martin Farach-Colton, Rob Johnson, Bradley C. Kuszmaul, and Donald E. Porter. 2015. BetrFS: A right-optimized write-optimized file system. In Proceedings of the 13th USENIX Conference on File and Storage Technologies (FAST’15). Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Hyojun Kim, Nitin Agrawal, and Cristian Ungureanu. 2012. Revisiting storage for smartphones. In Proceedings of the 10th USENIX Symposium on File and Storage Technologies (FAST’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Chunbo Lai, Song Jiang, Liqiong Yang, Shiding Lin, Guangyu Sun, Zhenyu Hou, Can Cui, and Jason Cong. 2015. Atlas: Baidus key-value storage system for cloud data. In Proceedings of the 31st IEEE Conference on Massive Data Storage (MSST’15).Google ScholarGoogle ScholarCross RefCross Ref
  34. Avinash Lakshman and Prashant Malik. 2009. Cassandra -- a decentralized structured storage system. In The 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware.Google ScholarGoogle Scholar
  35. Changman Lee, Dongho Sim, Jooyoung Hwang, and Sangyeun Cho. 2015. F2FS: A new file system for flash storage. In Proceedings of the 13th USENIX Conference on File and Storage Technologies (FAST’15). Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Hyeontaek Lim, Bin Fan, David G. Andersen, and Michael Kaminsky. 2011. SILT: A memory-efficient, high-performance key-value store. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP’11). Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Hyeontaek Lim, Dongsu Han, David G. Andersen, and Michael Kaminsky. 2014. MICA: A holistic approach to fast in-memory key-value storage. In Proceedings of the 11th Symposium on Networked Systems Design and Implementation (NSDI’14). Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. WiscKey: Separating keys from values in SSD-conscious storage. In Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST’16). Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Haohui Mai and Jing Zhao. 2015. Scaling HDFS to manage billions of files with key value stores. In The 8th Annual Hadoop Summit.Google ScholarGoogle Scholar
  40. Yandong Mao, Eddie Kohler, and Robert Morris. 2012. Cache craftiness for fast multicore key-value storage. In Proceedings of the EuroSys Conference (EuroSys’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Leonardo Marmol, Swaminathan Sundararaman, Nisha Talagala, and Raju Rangaswami. 2015. NVMKV: A scalable, lightweight, ftl-aware key-value store. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC’15). Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Changwoo Min, Kangnyeon Kim, Hyunjin Cho, Sang-Won Lee, and Young Ik Eom. 2012. SFS: Random write considered harmful in solid state drives. In Proceedings of the 10th USENIX Symposium on File and Storage Technologies (FAST’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, and Venkateshwaran Venkataramani. 2013. Scaling memcache at facebook. In Proceedings of the 10th Symposium on Networked Systems Design and Implementation (NSDI’13). Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Chris Nyberg, Tom Barclay, Zarka Cvetanovic, Jim Gray, and Dave Lomet. 1994. AlphaSort: A RISC machine sort. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data (SIGMOD’94). Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Patrick ONeil, Edward Cheng, Dieter Gawlick, and Elizabeth ONeil. 1996. The log-structured merge-tree (LSM-tree). Acta Inform. 33, 4 (1996), 351--385. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Simon Peter, Jialin Li, Irene Zhang, Dan R. K. Ports, Doug Woos, Arvind Krishnamurthy, and Thomas Anderson. 2014. Arrakis: The operating system is the control plane. In Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI’14). Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. Thanumalayan Sankaranarayana Pillai, Vijay Chidambaram, Ramnatthan Alagappan, Samer Al-Kiswany, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2014. All file systems are not created equal: On the complexity of crafting crash-consistent applications. In Proceedings of the 11th Symposium on Operating Systems Design and Implementation (OSDI’14). Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Eric Redmond. 2013. A Little Riak Book. Retrieved from http://www.littleriakbook.com.Google ScholarGoogle Scholar
  49. Kai Ren and Garth Gibson. 2013. TABLEFS: Enhancing metadata efciency in the local file system. In Proceedings of the USENIX Annual Technical Conference (USENIX’13). Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Kai Ren, Qing Zheng, Swapnil Patil, and Garth Gibson. 2014. IndexFS: Scaling file system metadata performance with stateless caching and bulk insertion. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC’14). Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Russell Sears and Raghu Ramakrishnan. 2012. bLSM: A general purpose log structured merge tree. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Pradeep Shetty, Richard Spillane, Ravikant Malpani, Binesh Andrews, Justin Seyster, and Erez Zadok. 2013. Building workload-independent storage with VT-trees. In Proceedings of the 11th USENIX Symposium on File and Storage Technologies (FAST’13). Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Roshan Sumbaly, Jay Kreps, Lei Gao, Alex Feinberg, Chinmay Soman, and Sam Shah. 2012. Serving large-scale batch computed data with project Voldemort. In Proceedings of the 10th USENIX Symposium on File and Storage Technologies (FAST’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Vijay Vasudevan, Michael Kaminsky, and David G. Andersen. 2012. Using vector interfaces to deliver millions of iops from a networked key-value storage server. In Proceedings of the ACM Symposium on Cloud Computing (SOCC’12). Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Peng Wang, Guangyu Sun, Song Jiang, Jian Ouyang, Shiding Lin, Chen Zhang, and Jason Cong. 2014. An efficient design and implementation of LSM-tree based key-value store on open-channel SSD. In Proceedings of the EuroSys Conference (EuroSys’14). Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. Xingbo Wu, Yuehai Xu, Zili Shao, and Song Jiang. 2015. LSM-trie: An LSM-tree-based ultra-large key-value store for small data. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC’15). Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. WiscKey: Separating Keys from Values in SSD-Conscious Storage

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM Transactions on Storage
          ACM Transactions on Storage  Volume 13, Issue 1
          Special Issue on USENIX FAST 2016 and Regular Papers
          February 2017
          201 pages
          ISSN:1553-3077
          EISSN:1553-3093
          DOI:10.1145/3054178
          • Editor:
          • Sam H. Noh
          Issue’s Table of Contents

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 2 March 2017
          • Received: 1 December 2016
          • Accepted: 1 December 2016
          Published in tos Volume 13, Issue 1

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader