research-article

GPU join processing revisited

Authors:
Tim Kaldewey, IBM Almaden Research, San Jose, CA
Guy Lohman, IBM Almaden Research, San Jose, CA
Rene Mueller, IBM Almaden Research, San Jose, CA
Peter Volk, Technische Universität Dresden

DaMoN '12: Proceedings of the Eighth International Workshop on Data Management on New Hardware, May 2012, Pages 55–62. https://doi.org/10.1145/2236584.2236592

Published: 21 May 2012

ABSTRACT

Until recently, the use of graphics processing units (GPUs) for query processing was limited by the amount of memory on the graphics card, a few gigabytes at best. Moreover, input tables had to be copied to GPU memory before they could be processed, and after computation was completed, query results had to be copied back to CPU memory. The newest generation of Nvidia GPUs and development tools introduces a common memory address space, which now allows the GPU to access CPU memory directly, lifting size limitations and obviating data copy operations. We confirm that this new technology can sustain 98% of its nominal rate of 6.3 GB/sec in practice, and exploit it to process database hash joins at the same rate, i.e., the join is processed "on the fly" as the GPU reads the input tables from CPU memory at PCI-E speeds. Compared to the fastest published results for in-memory joins on the CPU, this represents more than half an order of magnitude speed-up. All of our results include the cost of result materialization (often omitted in earlier work), and we investigate the implications of changing join predicate selectivity and table size.
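
To make the terms in the abstract concrete, the following is a minimal CPU-side sketch in Python of a hash join that materializes its result, together with the arithmetic behind "98% of its nominal rate of 6.3 GB/sec". It is not the authors' GPU implementation: the function names (`hash_join`, `effective_rate_gb_per_s`, `seconds_to_stream`), the `(key, payload)` row layout, and the convention that 1 GB equals 10^9 bytes are all assumptions made for illustration.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows):
    """Equi-join two lists of (key, payload) tuples and materialize the result.

    Hypothetical helper for illustration only; not the paper's GPU kernel.
    """
    # Build phase: index one table (ideally the smaller one) by join key.
    table = defaultdict(list)
    for key, payload in build_rows:
        table[key].append(payload)

    # Probe phase: stream the other table once and emit one output row
    # per matching build row. Appending to `result` is the materialization
    # step whose cost the abstract says is included in the measurements.
    result = []
    for key, payload in probe_rows:
        for build_payload in table.get(key, ()):
            result.append((key, build_payload, payload))
    return result


def effective_rate_gb_per_s(nominal=6.3, fraction=0.98):
    """Sustained transfer rate: 98% of the nominal 6.3 GB/sec from the abstract."""
    return nominal * fraction


def seconds_to_stream(num_bytes, rate_gb_per_s):
    """Time to read num_bytes at the given rate, assuming 1 GB = 10**9 bytes."""
    return num_bytes / (rate_gb_per_s * 1e9)


if __name__ == "__main__":
    build = [(1, "a"), (2, "b"), (3, "c")]
    probe = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
    print(hash_join(build, probe))
    print(round(effective_rate_gb_per_s(), 3))
```

In this sketch the size of `result` grows with the number of matching probe rows, which is one way to see why join predicate selectivity affects the cost of materialization.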

Published in
DaMoN '12: Proceedings of the Eighth International Workshop on Data Management on New Hardware
May 2012, 72 pages
ISBN: 9781450314459
DOI: 10.1145/2236584
Editors: Shimin Chen (HP Labs China), Stavros Harizopoulos (Nou Data)
Copyright © 2012 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 80 of 102 submissions, 78%