research-article

Interleaving with coroutines: a practical approach for robust index joins

Authors:
Georgios Psaropoulos

EPFL, Lausanne, Switzerland and SAP SE, Walldorf, Germany

EPFL, Lausanne, Switzerland and SAP SE, Walldorf, Germany
View Profile

,
Thomas Legler

SAP SE, Walldorf, Germany

SAP SE, Walldorf, Germany
View Profile

,
Norman May

SAP SE, Walldorf, Germany

SAP SE, Walldorf, Germany
View Profile

,
Anastasia Ailamaki

EPFL, Lausanne, Switzerland and RAW Labs SA

EPFL, Lausanne, Switzerland and RAW Labs SA
View Profile

Proceedings of the VLDB Endowment Volume 11 Issue 2pp 230–242https://doi.org/10.14778/3149193.3149202

Published:01 October 2017Publication History

Proceedings of the VLDB Endowment

Abstract

Index join performance is determined by the efficiency of the lookup operation on the involved index. Although database indexes are highly optimized to leverage processor caches, main memory accesses inevitably increase lookup runtime when the index outsizes the last-level cache; hence, index join performance drops. Still, robust index join performance becomes possible with instruction stream interleaving: given a group of lookups, we can hide cache misses in one lookup with instructions from other lookups by switching among their respective instruction streams upon a cache miss.

In this paper, we propose interleaving with coroutines for any type of index join. We showcase our proposal on SAP HANA by implementing binary search and CSB⁺-tree traversal for an instance of index join related to dictionary compression. Coroutine implementations not only perform similarly to prior interleaving techniques, but also resemble the original code closely, while supporting both interleaved and non-interleaved execution. Thus, we claim that coroutines make interleaving practical for use in real DBMS codebases.

References

C# Reference. https://msdn.microsoft.com/en-us/library/9k7k7cf0.aspx {Online; accessed 14-August-2017}.Google Scholar
Generators. Python Wiki. https://wiki.python.org/ moin/Generators {Online; accessed 14-August-2017}.Google Scholar
Programming languages --- C++ extensions for coroutines.Google Scholar
Transaction Processing Performance Council. TPC-DS benchmark version 2.3.0. http://www.tpc.org/tpcds/ {Online; accessed 14-August-2017}.Google Scholar
A. Ailamaki, D. J. DeWitt, M. D. Hill, and D. A. Wood. DBMSs on a modern processor: where does time go? In Proc. VLDB, pages 266--277, 1999. Google ScholarDigital Library
S. Chen, A. Ailamaki, P. B. Gibbons, and T. C. Mowry. Improving hash join performance through prefetching. ACM Trans. Database Syst., 32(3), 2007. Google ScholarDigital Library
M. Colgan. Oracle Database In-Memory. Technical report, Oracle Corporation, 2015. http://www.oracle.com/technetwork/database/in-memory/overview/twp-oracle-database-in-memory-2245633.html.Google Scholar
M. E. Conway. Design of a separable transition-diagram compiler. Commun. ACM, 6(7):396--408, 1963. Google ScholarDigital Library
F. Färber, N. May, W. Lehner, P. Große, I. Müller, H. Rauhe, and J. Dees. The SAP HANA Database - an architecture overview. IEEE Data Eng. Bull., 35(1):28--33, 2012.Google Scholar
S. Idreos, F. Groffen, N. Nes, S. Manegold, S. Mullender, and M. Kersten. MonetDB: two decades of research in column-oriented database architectures. IEEE Data Eng. Bull., 35(1):40--45, 2012.Google Scholar
Intel Corporation. Intel® 64 and IA-32 Architectures Optimization Reference Manual, June 2016.Google Scholar
A. Kemper and T. Neumann. HyPer: a hybrid OLTP & OLAP main memory database system based on virtual memory snapshots. In Proc. ICDE, pages 195--206, 2011. Google ScholarDigital Library
M. Kerrisk. The Linux Programming Interface: A Linux and UNIX System Programming Handbook. Google ScholarDigital Library
C. Kim, J. Chhugani, N. Satish, E. Sedlar, A. D. Nguyen, T. Kaldewey, V. W. Lee, S. A. Brandt, and P. Dubey. FAST: fast architecture sensitive tree search on modern CPUs and GPUs. In Proc. SIGMOD, pages 339--350, 2010. Google ScholarDigital Library
O. Kocberber, B. Falsafi, and B. Grot. Asynchronous memory access chaining. PVLDB, 9(4):252--263, 2015. Google ScholarDigital Library
M. D. Lam, E. E. Rothberg, and M. E. Wolf. The cache performance and optimizations of blocked algorithms. In Proc. ASPLOS, 1991. Google ScholarDigital Library
H. Lang, T. Mühlbauer, F. Funke, P. Boncz, T. Neumann, and A. Kemper. Data Blocks: hybrid OLTP and OLAP on compressed storage using both vectorization and compilation. Proc. SIGMOD, 2016. Google ScholarDigital Library
P.-A. Larson, C. Clinciu, C. Fraser, E. N. Hanson, M. Mokhtar, M. Nowakiewicz, V. Papadimos, S. L. Price, S. Rangarajan, R. Rusanu, and M. Saubhasik. Enhancements to SQL Server column stores. In Proc. SIGMOD, pages 1159--1168, 2013. Google ScholarDigital Library
P.-A. Larson, C. Clinciu, E. N. Hanson, A. Oks, S. L. Price, S. Rangarajan, A. Surna, and Q. Zhou. SQL Server column store indexes. Proc. SIGMOD, pages 1177--1184, 2011. Google ScholarDigital Library
S. Manegold, P. A. Boncz, and M. L. Kersten. Optimizing database architecture for the new bottleneck: memory access. VLDB Journal, 9(3):231--246, 2000. Google ScholarDigital Library
J. Marusarz. Understanding how general exploration works in Intel VTune Amplifier XE, 2015. https://software.intel.com/en-us/articles/understanding-how-general-exploration-works-in-intel-vtune-amplifier-xe {Online; accessed 14-August-2017}.Google Scholar
T. J. McCabe. A complexity measure. IEEE Trans. Softw. Eng., 2(4):308--320, July 1976. Google ScholarDigital Library
J. Melton. Advanced SQL 1999: Understanding Object-Relational, and Other Advanced Features. Elsevier Science Inc., 2002. Google ScholarDigital Library
A. L. D. Moura and R. Ierusalimschy. Revisiting coroutines. ACM Trans. Program. Lang. Syst., 31(2):6:1--6:31, Feb. 2009. Google ScholarDigital Library
I. Müller, C. Ratsch, and F. Färber. Adaptive string dictionary compression in in-memory column-store database systems. In Proc. EDBT, pages 283--294, 2014.Google Scholar
M. Poess and D. Potapov. Data compression in Oracle. Proc. VLDB, pages 937--947, 2003. Google ScholarDigital Library
V. Raman, G. Attaluri, R. Barber, N. Chainani, D. Kalmuk, V. KulandaiSamy, J. Leenstra, S. Lightstone, S. Liu, G. M. Lohman, T. Malkemus, R. Mueller, I. Pandis, B. Schiefer, D. Sharpe, R. Sidle, A. Storm, and L. Zhang. DB2 with BLU Acceleration: so much more than just a column store. PVLDB, 6(11):1080--1091, 2013. Google ScholarDigital Library
J. Rao and K. A. Ross. Making B+-trees cache conscious in main memory. In Proc. ACM SIGMOD, pages 475--486, 2000. Google ScholarDigital Library
RethinkDB Team. Improving a large C++ project with coroutines, 2010. https://www.rethinkdb.com/blog/improving-a-large-c-project-with-coroutines/ {Online; accessed 14-August-2016}.Google Scholar
M. E. Russinovich, D. A. Solomon, and A. Ionescu. Windows Internals, Part 1: Covering Windows Server 2008 R2 and Windows 7. Microsoft Press, 6th edition, 2012. Google ScholarDigital Library
J. Zhou, J. Cieslewicz, K. A. Ross, and M. Shah. Improving database performance on simultaneous multithreading processors. In Proc. VLDB, pages 49--60, 2005. Google ScholarDigital Library
J. Zhou and K. A. Ross. Buffering accesses to memory-resident index structures. In Proc. VLDB, pages 405--416, 2003. Google ScholarDigital Library

Recommendations

Exploiting coroutines to attack the "killer nanoseconds"

Database systems use many pointer-based data structures, including hash tables and B+-trees, which require extensive "pointer-chasing." Each pointer dereference, e.g., during a hash probe or a B+-tree traversal, can result in a CPU cache miss, stalling ...
Read More
Interleaving with coroutines: a systematic and practical approach to hide memory latency in index joins
Abstract
Index joins present a case of pointer-chasing code that causes data cache misses. In principle, we can hide these cache misses by overlapping them with computation: The lookups involved in an index join are parallel tasks whose execution can be ...
Read More
Interleaving: a multithreading technique targeting multiprocessors and workstations
ASPLOS VI: Proceedings of the sixth international conference on Architectural support for programming languages and operating systems

There is an increasing trend to use commodity microprocessors as the compute engines in large-scale multiprocessors. However, given that the majority of the microprocessors are sold in the workstation market, not in the multiprocessor market, it is only ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in
Proceedings of the VLDB Endowment Volume 11, Issue 2
October 2017
176 pages
ISSN:2150-8097
Editors:
Jian Pei
Simon Fraser University
,
Sihem Amer-Yahia
University of Grenoble Alpes, CNRS
Issue’s Table of Contents
Sponsors
In-Cooperation
Publisher
VLDB Endowment
Publication History
- Published: 1 October 2017
Published in pvldb Volume 11, Issue 2
Qualifiers
- research-article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 24
  Total Citations
  View Citations
- 424
  Total Downloads
- Downloads (Last 12 months)112
- Downloads (Last 6 weeks)9
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Interleaving with coroutines: a practical approach for robust index joins

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Recommendations

Exploiting coroutines to attack the "killer nanoseconds"

Interleaving with coroutines: a systematic and practical approach to hide memory latency in index joins

Interleaving: a multithreading technique targeting multiprocessors and workstations

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Interleaving with coroutines: a practical approach for robust index joins

Proceedings of the VLDB Endowment

Abstract

References

Cited By

Recommendations

Exploiting coroutines to attack the "killer nanoseconds"

Interleaving with coroutines: a systematic and practical approach to hide memory latency in index joins

Interleaving: a multithreading technique targeting multiprocessors and workstations

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media