Abstract
We present Nitro, a high-performance in-memory key-value storage engine used in Couchbase 4.5 Global Secondary Indexes. The Nitro storage engine is well suited for the recent hardware trends like large amounts of memory and many CPU cores. The storage engine leverages latch-free data structures and tries to achieve linear scalability for the index read-write operations. The Nitro storage engine offers concurrent readers and writers, lightweight database snapshots, stable scan, backup and recovery operations.
We integrated Nitro into the Couchbase Global Secondary Indexes (GSI) and observed significant improvement in performance compared to our disk oriented storage engine configured with the same amount of memory for buffer cache. On a 32 core machine, we observed an end-to-end GSI server insertion throughput of 1,650,000 entries/sec and index update throughput of 822,000 entries/sec. A single instance of Nitro data structure running on a 40 core machine achieved a peak insertion throughput of 4 million index entries/sec and entry lookup throughput of 10 million lookups/sec.
- Couchbase Multidimensional Scaling Overview. http://www.couchbase.com/multi-dimensional-scalability-overview.Google Scholar
- Couchbase NoSQL Database. http://www.couchbase.com/.Google Scholar
- Database Change Protocol. http://docs.couchbase.com/admin/admin/Concepts/dcp.html.Google Scholar
- MemSQL distributed database. http://www.memsql.com/.Google Scholar
- S. K. Cha and C. Song. P*TIME: highly scalable OLTP DBMS for managing update-intensive stream workload. Proceedings of the Thirtieth international conference on Very large data bases, pages 1033--1044, 2004. Google ScholarDigital Library
- D. Comer. Ubiquitous B-Tree. ACM Computing Surveys, 11(2):121--137, 1979. Google ScholarDigital Library
- E. I. P.-. L. P. M. R. S. N. V. M. Z. Cristian Diaconu, Craig Freedman. Hekaton: SQL Servers Memory-Optimized OLTP Engine. Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 1243--1254, 2013. Google ScholarDigital Library
- S. S. David B. Lomet and J. J. Levandoski. The Bw-Tree: A B-tree for new hardware platforms. Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE 2013), pages 302--313, 2013. Google ScholarDigital Library
- J. P.-C. B. S. S. Franz Farber, Sang Kyun Cha and W. Lehner. SAP HANA Database - Data Management for Modern Business Applications. ACM SIGMOD Record, pages 45--51, December. Google ScholarDigital Library
- J. P.-C. B. S. S. Franz Farber, Sang Kyun Cha and W. Lehner. Speedy transactions in multicore in-memory databases. Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 18--32, 2013. Google ScholarDigital Library
- K. Fraser. Practical lock-freedom. PhD thesis, University of Cambridge, 2004.Google Scholar
- M. Herlihy and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann, "2012". Google ScholarDigital Library
- J. L. J. Chris Anderson and N. Slater. CouchDB: The Definitive Guide Time to Relax. O'Reilly Media, Inc, "2010". Google ScholarDigital Library
- E. Kohler and R. T. Morris. Cache craftiness for fast multicore key-value storage. Proceedings of the 7th ACM european conference on Computer Systems, pages 183--196, 2012. Google ScholarDigital Library
- M. M. Michael. Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects. IEEE Transactions on Parallel and Distributed Systems, 15(6):491--504, 2004. Google ScholarDigital Library
- C. M. Ohad Rodeh, Josef Bacik. BTRFS: The Linux B-Tree Filesystem. ACM Transactions on Storage, 9(3), August 2013. Google ScholarDigital Library
- W. Pugh. Skip lists: a probabilistic alternative to balanced trees. Communications of the ACM, 33(6):668--676, June 1990. Google ScholarDigital Library
- K. K. Sang K. Cha, Sangyong Hwang and K. Kwon. Cache-Conscious Concurrency Control of Main-Memory Indexes on Shared-Memory Multiprocessor Systems. Proceedings of the 27th International Conference on Very Large Data Bases, pages 181--190, 2001. Google ScholarDigital Library
- S. M. Stavros Harizopoulos, Daniel J. Abadi and M. Stonebraker. OLTP Through the Looking Glass, and What We Found There. Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 981--992, 2008. Google ScholarDigital Library
- M. Stonebraker and A. Weisberg. The VoltDB Main Memory DBMS. IEEE Data Eng. Bull., 36(2):21--27, 2013.Google Scholar
- H. Sundell and P. Tsigas. Fast and lock-free concurrent priority queues for multi-thread systems. Journal of Parallel and Distributed Computing, 65(5):609--627, May 2005. Google ScholarDigital Library
- J. D. Valois. Lock-free linked lists using compare-and-swap. Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing, pages 214--222, 1995. Google ScholarDigital Library
Recommendations
Nitro: a capacity-optimized SSD cache for primary storage
USENIX ATC'14: Proceedings of the 2014 USENIX conference on USENIX Annual Technical ConferenceFor many primary storage customers, storage must balance the requirements for large capacity, high performance, and low cost. A well studied technique is to place a solid state drive (SSD) cache in front of hard disk drive (HDD) storage, which can ...
Nitro: hardware-based system call tracing for virtual machines
IWSEC'11: Proceedings of the 6th International conference on Advances in information and computer securityVirtual machine introspection (VMI) describes the method of monitoring and analyzing the state of a virtual machine from the hypervisor level. This lends itself well to security applications, though the hardware virtualization support from Intel and AMD ...
Nitro: network-aware virtual machine image management in geo-distributed clouds
CCGrid '18: Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid ComputingRecently, most large cloud providers, like Amazon and Microsoft, replicate their Virtual Machine Images (VMIs) on multiple geographically distributed data centers to offer fast service provisioning. Provisioning a service may require to transfer a VMI ...
Comments