The Yin and Yang of processing data warehousing queries on GPU devices

Authors:
Yuan Yuan, Department of Computer Science and Engineering, The Ohio State University
Rubao Lee, Department of Computer Science and Engineering, The Ohio State University
Xiaodong Zhang, Department of Computer Science and Engineering, The Ohio State University
Proceedings of the VLDB Endowment, Volume 6, Issue 10, pp 817–828, https://doi.org/10.14778/2536206.2536210

Published: 01 August 2013

Abstract

The database community has made significant research efforts in the past few years to optimize query processing on GPUs. However, GPUs have hardly been adopted in major warehousing production systems. In preparation for merging GPUs into warehousing systems, we have identified and addressed several critical issues in a three-dimensional study of warehousing queries on GPUs, varying query characteristics, software techniques, and GPU hardware configurations. We also propose an analytical model to understand and predict query performance on GPUs. Based on our study, we present our performance insights for warehousing query execution on GPUs. The objective of our work is to provide comprehensive guidance for GPU architects, software system designers, and database practitioners on narrowing the speed gap between GPU kernel execution (the fast mode) and the data transfer that prepares GPU execution (the slow mode), for high performance in processing data warehousing queries. The GPU query engine developed in this work is open source.
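To make the fast mode versus slow mode contrast concrete, here is a minimal back-of-the-envelope sketch in Python. It is not the analytical model proposed in the paper; it only assumes a memory-bound, scan-style query whose cost is the sum of a host-to-device copy and one pass over the data in GPU memory. The function name and all bandwidth and size figures are hypothetical values chosen for illustration.

```python
def estimate_query_time(data_bytes, pcie_gb_per_s, gpu_mem_gb_per_s, passes=1):
    """Return (transfer_s, kernel_s, total_s) for a scan-style query.

    Illustrative assumption: transfer and kernel time are both bandwidth
    bound and do not overlap. None of these numbers come from the paper.
    """
    gigabytes = data_bytes / 1e9
    transfer_s = gigabytes / pcie_gb_per_s           # slow mode: copy to GPU
    kernel_s = passes * gigabytes / gpu_mem_gb_per_s  # fast mode: GPU kernel
    return transfer_s, kernel_s, transfer_s + kernel_s


# Hypothetical setup: 4 GB of column data, 8 GB/s over PCIe,
# 160 GB/s of GPU device memory bandwidth.
transfer_s, kernel_s, total_s = estimate_query_time(
    data_bytes=4e9, pcie_gb_per_s=8.0, gpu_mem_gb_per_s=160.0
)
transfer_share = transfer_s / total_s
print(f"transfer {transfer_s:.3f}s, kernel {kernel_s:.3f}s, "
      f"transfer share {transfer_share:.0%}")
```

Under these assumed figures the copy dominates the total time, which is the kind of gap the abstract describes; techniques that shrink the transferred bytes or overlap the copy with execution would change the first term rather than the second.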

Index Terms

The Yin and Yang of processing data warehousing queries on GPU devices
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database query processing
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory
      1. Database query processing and optimization (theory)

Published in

Proceedings of the VLDB Endowment, Volume 6, Issue 10
August 2013
180 pages
ISSN: 2150-8097
Publisher: VLDB Endowment