research-article

S3: a scalable in-memory skip-list index for key-value store

Authors:
Jingtian Zhang

Zhejiang University, Hangzhou, Zhejiang, China

Zhejiang University, Hangzhou, Zhejiang, China
View Profile

,
Sai Wu

Zhejiang University, Hangzhou, Zhejiang, China

Zhejiang University, Hangzhou, Zhejiang, China
View Profile

,
Zeyuan Tan

Zhejiang University, Hangzhou, Zhejiang, China

Zhejiang University, Hangzhou, Zhejiang, China
View Profile

,
Gang Chen

Zhejiang University, Hangzhou, Zhejiang, China

Zhejiang University, Hangzhou, Zhejiang, China
View Profile

,
Zhushi Cheng

Alibaba Group, Hangzhou, Zhejiang, China

Alibaba Group, Hangzhou, Zhejiang, China
View Profile

,
Wei Cao

Alibaba Group, Hangzhou, Zhejiang, China

Alibaba Group, Hangzhou, Zhejiang, China
View Profile

,
Yusong Gao

Alibaba Group, Hangzhou, Zhejiang, China

Alibaba Group, Hangzhou, Zhejiang, China
View Profile

,
Xiaojie Feng

Alibaba Group, Hangzhou, Zhejiang, China

Alibaba Group, Hangzhou, Zhejiang, China
View Profile

Proceedings of the VLDB Endowment Volume 12 Issue 12pp 2183–2194https://doi.org/10.14778/3352063.3352134

Published:01 August 2019Publication History

Proceedings of the VLDB Endowment

Abstract

Many new memory indexing structures have been proposed and outperform current in-memory skip-list index adopted by LevelDB, RocksDB and other key-value systems. However, those new indexes cannot be easily intergrated with key-value systems, because most of them do not consider how the data can be efficiently flushed to disk. Some assumptions, such as fixed size key and value, are unrealistic for real applications. In this paper, we present S3, a scalable in-memory skip-list index for the customized version of RocksDB in Alibaba Cloud. S3 adopts a two-layer structure. In the top layer, a cache-sensitive structure is used to maintain a few guard entries to facilitate the search over the skip-list. In the bottom layer, a semi-ordered skip-list index is built to support highly concurrent insertions and fast lookup and range query. To further improve the performance, we train a neural model to select guard entries intelligently according to the data distribution and query distribution. Experiments on multiple datasets show that S3 achieves a comparable performance to other new memory indexing schemes, and can replace current in-memory skip-list of LevelDB and RocksDB to support huge volume of data.

References

Apache hbase. http://hbase.apache.org/.Google Scholar
Bw-tree. https://github.com/wangziqi2013/bwtree.Google Scholar
Cicada. https://github.com/efficient/cicada-engine.Google Scholar
Clht. https://github.com/lpd-epfl/clht.Google Scholar
Cockroachdb. https://github.com/cockroachdb/cockroach.Google Scholar
Hyperleveldb. https://github.com/rescrv/hyperleveldb.Google Scholar
Judy arrays. http://judy.sourceforge.net/.Google Scholar
Leveldb. http://ccrma.stanford.edu/jos/bayes/bayes.html.Google Scholar
Masstree. https://github.com/kohler/masstree-beta.Google Scholar
Mongodb. https://www.mongodb.com.Google Scholar
Peloton. https://github.com/cmu-db/peloton.Google Scholar
Redis. https://redis.io/.Google Scholar
Rocksdb. http://rocksdb.org.Google Scholar
Search results apache flink: Scalable stream and batch data processing. https://flink.apache.org.Google Scholar
Yahoo! cloud serving benchmark in c++. https://github.com/basicthinker/ycsb-c.Google Scholar
V. Alvarez, S. Richter, X. Chen, and J. Dittrich. A comparison of adaptive radix trees and hash tables. In ICDE, pages 1227--1238, 2015.Google ScholarCross Ref
O. Balmau, R. Guerraoui, V. Trigonakis, and I. Zablotchi. Flodb: Unlocking memory in persistent key-value stores. In EuroSys, pages 80--94, 2017. Google ScholarDigital Library
S. Chen, P. B. Gibbons, and T. C. Mowry. Improving index performance through prefetching. In SIGMOD, pages 235--246, 2001. Google ScholarDigital Library
B. F. Cooper, A. Silberstein, E. Tam, R. Ramakrishnan, and R. Sears. Benchmarking cloud serving systems with YCSB. In SoCC, pages 143--154, 2010. Google ScholarDigital Library
T. David, R. Guerraoui, and V. Trigonakis. Asynchronized concurrency: The secret to scaling concurrent search data structures. In ASPLOS, pages 631--644, 2015. Google ScholarDigital Library
G. Golan-Gueta, E. Bortnikov, E. Hillel, and I. Keidar. Scaling concurrent log-structured data stores. In EuroSys, pages 32:1--32:14, 2015. Google ScholarDigital Library
R. A. Hankins and J. M. Patel. Effect of node size on the performance of cache-conscious b<sup>+</sup>-trees. In SIGMETRICS, pages 283--294, 2003. Google ScholarDigital Library
S. Hochreiter. The vanishing gradient problem during learning recurrent neural nets and problem solutions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 6(2):107--116, 1998. Google ScholarDigital Library
S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735--1780, 1997. Google ScholarDigital Library
C. Kim, J. Chhugani, N. Satish, E. Sedlar, A. D. Nguyen, T. Kaldewey, V. W. Lee, S. A. Brandt, and P. Dubey. FAST: fast architecture sensitive tree search on modern cpus and gpus. In SIGMOD, pages 339--350, 2010. Google ScholarDigital Library
V. Leis, A. Kemper, and T. Neumann. The adaptive radix tree: Artful indexing for main-memory databases. In ICDE, pages 38--49, 2013. Google ScholarDigital Library
J. J. Levandoski, D. B. Lomet, and S. Sengupta. The bw-tree: A b-tree for new hardware platforms. In ICDE, pages 302--313, 2013. Google ScholarDigital Library
H. Lim, M. Kaminsky, and D. G. Andersen. Cicada: Dependably fast multi-core in-memory transactions. In SIGMOD, pages 21--35, 2017. Google ScholarDigital Library
L. Lu, T. S. Pillai, H. Gopalakrishnan, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. Wisckey: Separating keys from values in ssd-conscious storage. TOS, 13(1):5:1--5:28, 2017. Google ScholarDigital Library
Y. Mao, E. Kohler, and R. T. Morris. Cache craftiness for fast multicore key-value storage. In EuroSys, pages 183--196, 2012. Google ScholarDigital Library
P. E. O'Neil, E. Cheng, D. Gawlick, and E. J. O'Neil. The log-structured merge-tree (lsm-tree). Acta Inf., 33(4):351--385, 1996. Google ScholarDigital Library
W. Pugh. Skip lists: A probabilistic alternative to balanced trees. Commun. ACM, 33(6):668--676, 1990. Google ScholarDigital Library
P. Raju, R. Kadekodi, V. Chidambaram, and I. Abraham. Pebblesdb: Building key-value stores using fragmented log-structured merge trees. In SOSP, pages 497--514, 2017. Google ScholarDigital Library
J. Rao and K. A. Ross. Cache conscious indexing for decision-support in main memory. In VLDB, pages 78--89, 1999. Google ScholarDigital Library
J. Rao and K. A. Ross. Making b+-trees cache conscious in main memory. In SIGMOD, pages 475--486, 2000. Google ScholarDigital Library
J. Sewall, J. Chhugani, C. Kim, N. Satish, and P. Dubey. PALM: parallel architecture-friendly latch-free modifications to B+ trees on many-core processors. PVLDB, 4(11):795--806, 2011.Google ScholarDigital Library
S. Sprenger, S. Zeuch, and U. Leser. Cache-sensitive skip list: Efficient range queries on modern cpus. In IMDM, pages 1--17, 2016.Google Scholar
I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. CoRR, abs/1409.3215, 2014.Google Scholar
G. Wu, X. He, and B. Eckart. An adaptive write buffer management scheme for flash-based ssds. TOS, 8(1):1:1--1:24, 2012. Google ScholarDigital Library
X. Wu, Y. Xu, Z. Shao, and S. Jiang. Lsm-trie: An lsm-tree-based ultra-large key-value store for small data items. In USENIX, pages 71--82, 2015. Google ScholarDigital Library
Z. Xie, Q. Cai, G. Chen, R. Mao, and M. Zhang. A comprehensive performance evaluation of modern in-memory indices. In ICDE, pages 641--652, 2018.Google ScholarCross Ref
Z. Xie, Q. Cai, H. V. Jagadish, B. C. Ooi, and W. Wong. PI : a parallel in-memory skip list based index. CoRR, abs/1601.00159, 2016.Google Scholar
Z. Xie, Q. Cai, H. V. Jagadish, B. C. Ooi, and W. Wong. Parallelizing skip lists for in-memory multi-core database systems. In ICDE, pages 119--122, 2017.Google ScholarCross Ref

Index Terms

S3: a scalable in-memory skip-list index for key-value store
1. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Index terms have been assigned to the content through auto-classification.

Recommendations

HopsFS-S3: Extending Object Stores with POSIX-like Semantics and more (industry track)
Middleware '20: Proceedings of the 21st International Middleware Conference Industrial Track

Object stores have become the de-facto platform for storage in the cloud due to their scalability, high availability, and low cost. However, they provide weaker metadata semantics and lower performance compared to distributed hierarchical file systems. ...
Read More
Building a database on S3
SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data

There has been a great deal of hype about Amazon's simple storage service (S3). S3 provides infinite scalability and high availability at low cost. Currently, S3 is used mostly to store multi-media documents (videos, photos, audio) which are shared by a ...
Read More
Amazon S3 for science grids: a viable solution?
DADC '08: Proceedings of the 2008 international workshop on Data-aware distributed computing

Amazon.com has introduced the Simple Storage Service (S3), a commodity-priced storage utility. S3 aims to provide storage as a low-cost, highly available service, with a simple 'pay-as-you-go' charging model. This article makes three contributions. ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
Proceedings of the VLDB Endowment Volume 12, Issue 12
August 2019
547 pages
ISSN:2150-8097
Editors:
Lei Chen,
Fatma Özcan
Issue’s Table of Contents
Sponsors
In-Cooperation
Publisher
VLDB Endowment
Publication History
- Published: 1 August 2019
Published in pvldb Volume 12, Issue 12
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 6
  Total Citations
  View Citations
- 547
  Total Downloads
- Downloads (Last 12 months)126
- Downloads (Last 6 weeks)32
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

S3: a scalable in-memory skip-list index for key-value store

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Index Terms

Recommendations

HopsFS-S3: Extending Object Stores with POSIX-like Semantics and more (industry track)

Building a database on S3

Amazon S3 for science grids: a viable solution?

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

S3: a scalable in-memory skip-list index for key-value store

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Index Terms

Recommendations

HopsFS-S3: Extending Object Stores with POSIX-like Semantics and more (industry track)

Building a database on S3

Amazon S3 for science grids: a viable solution?

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media