research-article

In-memory performance for big data

Authors:
Goetz Graefe (HP Labs, Palo Alto),
Haris Volos (HP Labs, Palo Alto),
Hideaki Kimura (HP Labs, Palo Alto),
Harumi Kuno (HP Labs, Palo Alto),
Joseph Tucek (HP Labs, Palo Alto),
Mark Lillibridge (HP Labs, Palo Alto),
Alistair Veitch (Google)

Proceedings of the VLDB Endowment, Volume 8, Issue 1, pp. 37–48. https://doi.org/10.14778/2735461.2735465

Published: 1 September 2014

Abstract

When a working set fits into memory, the overhead imposed by the buffer pool renders traditional databases non-competitive with in-memory designs that sacrifice the benefits of a buffer pool. However, despite the large memory available with modern hardware, data skew, shifting workloads, and complex mixed workloads make it difficult to guarantee that a working set will fit in memory. Hence, some recent work has focused on enabling in-memory databases to protect performance when the working data set almost fits in memory. Contrary to those prior efforts, we enable buffer pool designs to match in-memory performance while supporting the "big data" workloads that continue to require secondary storage, thus providing the best of both worlds. We introduce here a novel buffer pool design that adapts pointer swizzling for references between system objects (as opposed to application objects), and uses it to practically eliminate buffer pool overheads for memory-resident data. Our implementation and experimental evaluation demonstrate that we achieve graceful performance degradation when the working set grows to exceed the buffer pool size, and graceful improvement when the working set shrinks towards and below the memory and buffer pool sizes.
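To make the central idea concrete, the following is a minimal sketch of pointer swizzling between pages in a buffer pool. It is not the authors' implementation: the class names (`Page`, `BufferPool`), the tag bit `SWIZZLED_BIT`, the `lookups` counter, and the dictionary that stands in for secondary storage are all assumptions made for illustration. The sketch only shows that a parent-to-child reference can hold either a page identifier, which must be resolved through the buffer pool's lookup table, or a tagged in-memory frame index, which bypasses that table.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.
SWIZZLED_BIT = 1 << 63  # assumed tag bit: set when a reference is a frame index


class Page:
    def __init__(self, page_id, children=()):
        self.page_id = page_id
        # Each child reference is a page id or a tagged (swizzled) frame index.
        self.children = list(children)


class BufferPool:
    def __init__(self, storage):
        self.storage = storage  # page_id -> tuple of child page ids (stand-in for disk)
        self.frames = []        # in-memory page frames
        self.table = {}         # page_id -> frame index (the classic lookup table)
        self.lookups = 0        # number of lookup-table probes performed

    def _load(self, page_id):
        """Resolve a page id to a frame index through the lookup table."""
        self.lookups += 1
        frame = self.table.get(page_id)
        if frame is None:  # miss: read the page from "storage" into a new frame
            self.frames.append(Page(page_id, self.storage[page_id]))
            frame = len(self.frames) - 1
            self.table[page_id] = frame
        return frame

    def fix_root(self, page_id):
        return self.frames[self._load(page_id)]

    def fix_child(self, parent, slot):
        """Follow a child reference, swizzling it on first use."""
        ref = parent.children[slot]
        if ref & SWIZZLED_BIT:
            # Swizzled: go straight to the frame, no table probe.
            return self.frames[ref & ~SWIZZLED_BIT]
        frame = self._load(ref)
        parent.children[slot] = frame | SWIZZLED_BIT
        return self.frames[frame]

    def unswizzle(self, parent, slot):
        """Restore the page id, e.g. before the child could be evicted."""
        ref = parent.children[slot]
        if ref & SWIZZLED_BIT:
            parent.children[slot] = self.frames[ref & ~SWIZZLED_BIT].page_id
```

In this sketch, the first traversal of a reference pays for one table probe and later traversals pay for none, which is the kind of overhead the abstract describes eliminating for memory-resident data. Eviction, latching, and concurrency are deliberately left out.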

Index Terms

In-memory performance for big data
1. Information systems
  1. Data management systems
    1. Database design and models
    2. Database management system engines

Published in
Proceedings of the VLDB Endowment, Volume 8, Issue 1
September 2014
100 pages
ISSN: 2150-8097
Editors: Chen Li (University of California, Irvine), Volker Markl (TU Berlin)
Publisher: VLDB Endowment