research-article

Design and evaluation of main memory hash join algorithms for multi-core CPUs

Authors:
Spyros Blanas

University of Wisconsin-Madison, Madison, WI, USA

University of Wisconsin-Madison, Madison, WI, USA
View Profile

,
Yinan Li

University of Wisconsin-Madison, Madison, WI, USA

University of Wisconsin-Madison, Madison, WI, USA
View Profile

,
Jignesh M. Patel

University of Wisconsin-Madison, Madison, WI, USA

University of Wisconsin-Madison, Madison, WI, USA
View Profile

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of dataJune 2011Pages 37–48https://doi.org/10.1145/1989323.1989328

Published:12 June 2011Publication History

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data

Pages 37–48

ABSTRACT

The focus of this paper is on investigating efficient hash join algorithms for modern multi-core processors in main memory environments. This paper dissects each internal phase of a typical hash join algorithm and considers different alternatives for implementing each phase, producing a family of hash join algorithms. Then, we implement these main memory algorithms on two radically different modern multi-processor systems, and carefully examine the factors that impact the performance of each method.

Our analysis reveals some interesting results -- a very simple hash join algorithm is very competitive to the other more complex methods. This simple join algorithm builds a shared hash table and does not partition the input relations. Its simplicity implies that it requires fewer parameter settings, thereby making it far easier for query optimizers and execution engines to use it in practice. Furthermore, the performance of this simple algorithm improves dramatically as the skew in the input data increases, and it quickly starts to outperform all other algorithms.

References

A. Ailamaki, D. J. DeWitt, M. D. Hill, and D. A. Wood. DBMSs on a modern processor: Where does time go? In VLDB, pages 266--277, 1999. Google ScholarDigital Library
P. A. Boncz, S. Manegold, and M. L. Kersten. Database architecture optimized for the new bottleneck: Memory access. In VLDB, pages 54--65, 1999. Google ScholarDigital Library
J. Cieslewicz, W. Mee, and K. A. Ross. Cache-conscious buffering for database operators with state. In DaMoN, pages 43--51, 2009. Google ScholarDigital Library
J. Cieslewicz and K. A. Ross. Data partitioning on chip multiprocessors. In DaMoN, pages 25--34, 2008. Google ScholarDigital Library
J. Cieslewicz, K. A. Ross, and I. Giannakakis. Parallel buffers for chip multiprocessors. In DaMoN, 2007. Google ScholarDigital Library
D. J. DeWitt and J. Gray. Parallel database systems: The future of database processing or a passing fad? SIGMOD Record, 19(4):104--112, 1990. Google ScholarDigital Library
D. J. DeWitt, R. H. Katz, F. Olken, L. D. Shapiro, M. Stonebraker, and D. A. Wood. Implementation techniques for main memory database systems. In SIGMOD Conference, pages 1--8, 1984. Google ScholarDigital Library
D. J. DeWitt, J. F. Naughton, D. A. Schneider, and S. Seshadri. Practical skew handling in parallel joins. In VLDB, pages 27--40, 1992. Google ScholarDigital Library
P. Garcia and H. F. Korth. Database hash-join algorithms on multithreaded computer architectures. In Conf. Computing Frontiers, pages 241--252, 2006. Google ScholarDigital Library
G. Graefe, R. Bunker, and S. Cooper. Hash joins and hash teams in Microsoft SQL Server. In VLDB, pages 86--97, 1998. Google ScholarDigital Library
C. Kim, E. Sedlar, J. Chhugani, T. Kaldewey, A. D. Nguyen, A. D. Blas, V. W. Lee, N. Satish, and P. Dubey. Sort vs. hash revisited: Fast join implementation on modern multi-core CPUs. PVLDB, 2(2):1378--1389, 2009. Google ScholarDigital Library
S. Manegold, P. A. Boncz, and M. L. Kersten. What happens during a join? Dissecting CPU and memory optimization effects. In VLDB, pages 339--350, 2000. Google ScholarDigital Library
S. Manegold, P. A. Boncz, and M. L. Kersten. Optimizing main-memory join on modern hardware. IEEE Trans. Knowl. Data Eng., 14(4):709--730, 2002. Google ScholarDigital Library
R. Pagh and F. F. Rodler. Cuckoo hashing. J. Algorithms, 51(2):122--144, 2004. Google ScholarDigital Library
K. A. Ross. Efficient hash probes on modern processors. In ICDE, pages 1297--1301, 2007.Google ScholarCross Ref
A. Shatdal, C. Kant, and J. F. Naughton. Cache conscious algorithms for relational query processing. In VLDB, pages 510--521, 1994. Google ScholarDigital Library
M. Stonebraker. The case for shared nothing. In HPTS, 1985.Google Scholar

Index Terms

Design and evaluation of main memory hash join algorithms for multi-core CPUs
1. Information systems
  1. Data management systems
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory
      1. Database query processing and optimization (theory)

Recommendations

Multi-core, main-memory joins: sort vs. hash revisited

In this paper we experimentally study the performance of main-memory, parallel, multi-core join algorithms, focusing on sort-merge and (radix-)hash join. The relative performance of these two join approaches have been a topic of discussion for a long ...
Read More
Performance evaluation of main-memory hash joins on KNL

New hardware features have propelled designs and analysis in main-memory hash joins. In previous studies, memory access has always been the primary bottleneck for hash join algorithms. However, there are relatively few studies devoted to bottlenecks ...
Read More
Performance Gaps between OpenMP and OpenCL for Multi-core CPUs
ICPPW '12: Proceedings of the 2012 41st International Conference on Parallel Processing Workshops

OpenCL and OpenMP are the most commonly used programming models for multi-core processors. They are also fundamentally different in their approach to parallelization. In this paper, we focus on comparing the performance of OpenCL and OpenMP. We select ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
June 2011
1364 pages
ISBN:9781450306614
DOI:10.1145/1989323
General Chair:
Timos Sellis
IMIS/RC Athena
,
Program Chair:
Renée J. Miller
University of Toronto
,
Publications Chairs:
Anastasios Kementsietsidis
IBM T.J. Watson Research Center
,
Yannis Velegrakis
University of Trento
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 12 June 2011
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
hash join
main memory
multi-core
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate785of4,003submissions,20%
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 178
  Total Citations
  View Citations
- 1,522
  Total Downloads
- Downloads (Last 12 months)106
- Downloads (Last 6 weeks)8
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Design and evaluation of main memory hash join algorithms for multi-core CPUs

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data

ABSTRACT

References

Cited By

Index Terms

Recommendations

Multi-core, main-memory joins: sort vs. hash revisited

Performance evaluation of main-memory hash joins on KNL

Performance Gaps between OpenMP and OpenCL for Multi-core CPUs

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Design and evaluation of main memory hash join algorithms for multi-core CPUs

SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data

ABSTRACT

References

Cited By

Index Terms

Recommendations

Multi-core, main-memory joins: sort vs. hash revisited

Performance evaluation of main-memory hash joins on KNL

Performance Gaps between OpenMP and OpenCL for Multi-core CPUs

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media