ABSTRACT
Key-value stores which keep the data entirely in main memory can serve applications whose performance criteria cannot be met by disk-based key-value stores. This paper evaluates the performance implications of cache-conscious data placement in an in-memory key-value store by examining how many values have to be stored consecutively in blocks in order to fully exploit memory locality during bandwidth-bound operations. We contribute by introducing a random block traversal main memory access pattern, by describing the corresponding memory access costs as well as by formally and experimentally deriving the correlation between block size and throughput. Our calculations and experiments vary the value and block sizes as well as their placement in the memory and derive their impact on cache-misses throughout the different memory hierarchies, the ability to prefetch data, and the number of needed CPU cycles to perform a certain set of data operations. The paper closes with the insight that a block-wise grouping of relatively few key-value pairs increases the throughput up to a factor six and with a discussion which implications a block-wise grouping of data has on the system design key-value store.
- A. Ailamaki, D. J. DeWitt, M. D. Hill, and D. A. Wood. Dbmss on a modern processor: Where does time go? In M. P. Atkinson, M. E. Orlowska, P. Valduriez, S. B. Zdonik, and M. L. Brodie, editors, VLDB, pages 266--277. Morgan Kaufmann, 1999. Google ScholarDigital Library
- F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: a distributed storage system for structured data. In OSDI '06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation, pages 15--15, Berkeley, CA, USA, 2006. USENIX Association. Google ScholarDigital Library
- Danga Interactive. memcached - a distributed memory object caching system. http://www.memcached.org/, Accessed on Nov 28 2010.Google Scholar
- J. Dean. Designs, Lessons and Advice from Building Large Distributed Systems. LADIS 2009 keynote. Available online at http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynoteladis2009.pdf, 2009.Google Scholar
- G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: amazon's highly available key-value store. In Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles, SOSP '07, pages 205--220, New York, NY, USA, 2007. ACM. Google ScholarDigital Library
- D. J. DeWitt, R. H. Katz, F. Olken, L. D. Shapiro, M. R. Stonebraker, and D. A. Wood. Implementation techniques for main memory database systems. In Proceedings of the 1984 ACM SIGMOD international conference on Management of data, SIGMOD '84, pages 1--8, New York, NY, USA, 1984. ACM. Google ScholarDigital Library
- U. Drepper. What Every Programmer Should Know About Memory. http://lwn.net/Articles/250967/, Accessed on Nov 28 2010.Google Scholar
- A. Lakshman and P. Malik. Cassandra: structured storage system on a p2p network. In Proceedings of the 28th ACM symposium on Principles of distributed computing, PODC '09, pages 5--5, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- W. Lin, S. K. Reinhardt, and D. Burger. Reducing dram latencies with an integrated memory hierarchy design. In Proceedings of the 7th International Symposium on High-Performance Computer Architecture, HPCA '01, page 301, Washington, DC, USA, 2001. IEEE Computer Society. Google ScholarDigital Library
- S. Manegold. The Calibrator (v0.9e), a Cache-Memory and TLB Calibration Tool. http://homepages.cwi.nl/manegold/Calibrator/, Accessed on Nov 14 2010.Google Scholar
- S. Manegold, P. Boncz, and M. L. Kersten. Generic database cost models for hierarchical memory systems. In Proceedings of the 28th international conference on Very Large Data Bases, VLDB '02, pages 191--202. VLDB Endowment, 2002. Google ScholarDigital Library
- S. Manegold, P. Boncz, and M. L. Kersten. Generic database cost models for hierarchical memory systems. Technical Report INS-R0203, CWI, Amsterdam, The Netherlands, March 2002. Google ScholarDigital Library
- J. K. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Mazières, S. Mitra, A. Narayanan, M. Rosenblum, S. M. Rumble, E. Stratmann, and R. Stutsman. The case for ramclouds: Scalable high-performance storage entirely in dram. In SIGOPS OSR. Stanford InfoLab, 2009. Google ScholarDigital Library
- H. Pirk. Cache conscious data layouting for in-memory databases. Master's thesis, Humboldt-Universität zu Berlin, Germany, 2010.Google Scholar
- S. Sanfilippo. redis - a persistent key-value database with built-in net interface written in ansi-c for posix systems. http://code.google.com/p/redis/, Accessed on Nov 28 2010.Google Scholar
- R. Schöne, W. E. Nagel, and S. Pflüger. Analyzing cache bandwidth on the intel core 2 architecture. In C. H. Bischof, H. M. Bücker, P. Gibbon, G. R. Joubert, T. Lippert, B. Mohr, and F. J. Peters, editors, PARCO, volume 15 of Advances in Parallel Computing, pages 365--372. IOS Press, 2007.Google Scholar
- B. Trushkowsky, P. Bodik, A. Fox, M. J. Franklin, M. I. Jordan, and D. A. Patterson. The scads director: Scaling a distributed storage system under stringent performance requirements. In to be appear in Proceeding of FAST 2011 - 9th USENIX Conference on File and Storage Technologies, 2011. Google ScholarDigital Library
Index Terms
- Cache-conscious data placement in an in-memory key-value store
Recommendations
LSM-tree managed storage for large-scale key-value store
SoCC '17: Proceedings of the 2017 Symposium on Cloud ComputingKey-value stores are increasingly adopting LSM-trees as their enabling data structure in the backend storage, and persisting their clustered data through a file system. A file system is expected to not only provide file/directory abstraction to organize ...
NVLSM: A Persistent Memory Key-Value Store Using Log-Structured Merge Tree with Accumulative Compaction
Computer systems utilizing byte-addressable Non-Volatile Memory (NVM) as memory/storage can provide low-latency data persistence. The widely used key-value stores using Log-Structured Merge Tree (LSM-Tree) are still beneficial for NVM systems in aspects ...
Exploring storage class memory with key value stores
INFLOW '13: Proceedings of the 1st Workshop on Interactions of NVM/FLASH with Operating Systems and WorkloadsIn the near future, new storage-class memory (SCM) technologies -- such as phase-change memory and memristors -- will radically change the nature of long-term storage. These devices will be cheap, non-volatile, byte addressable, and near DRAM density ...
Comments