Abstract
GPU acceleration is a promising approach to speeding up query processing in database systems by using low-cost graphics processors as coprocessors. Two major trends have emerged in this area: (1) the development of frameworks for scheduling tasks on heterogeneous CPU/GPU platforms, which mainly targets general application coprocessing and does not consider the specifics of database query processing and optimization; and (2) the acceleration of database operations using efficient GPU algorithms, which typically cannot be applied easily to other database systems because of their analytical, algorithm-specific cost models. One major challenge is how to combine traditional database query processing with GPU coprocessing techniques and efficient scheduling of database operations in a GPU-aware query optimizer. In this thesis, we develop a hybrid query processing engine that extends the traditional physical optimization process to generate hybrid query plans and performs cost-based optimization so that the advantages of CPUs and GPUs are combined. Furthermore, we aim for a solution that is portable across different GPU-accelerated database management systems to maximize applicability. Preliminary results indicate great potential.
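To make the idea of cost-based processor selection concrete, the following is a minimal sketch and not the engine described in the thesis. It assumes that execution time per operator and device can be approximated by a straight line in the input size, fitted from observed runtimes, so that no algorithm-specific analytical cost model is required. The names `LearnedCostModel` and `choose_device`, the operator label `"sort"`, and all timing values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class LearnedCostModel:
    """Hypothetical cost model: runtime ~ slope * input_size + intercept,
    fitted by least squares separately for each (operator, device) pair."""

    def __init__(self):
        # (operator, device) -> list of (input_size, observed_runtime)
        self.samples = defaultdict(list)

    def observe(self, operator, device, input_size, runtime):
        """Record one measured execution."""
        self.samples[(operator, device)].append((input_size, runtime))

    def estimate(self, operator, device, input_size):
        """Return the predicted runtime, or None with fewer than two samples."""
        pts = self.samples[(operator, device)]
        if len(pts) < 2:
            return None
        n = len(pts)
        mean_x = sum(x for x, _ in pts) / n
        mean_y = sum(y for _, y in pts) / n
        var_x = sum((x - mean_x) ** 2 for x, _ in pts)
        if var_x == 0:
            # All samples share one input size: fall back to the mean runtime.
            return mean_y
        slope = sum((x - mean_x) * (y - mean_y) for x, y in pts) / var_x
        return mean_y + slope * (input_size - mean_x)


def choose_device(model, operator, input_size, devices=("cpu", "gpu")):
    """Pick the device with the lowest estimated runtime.
    Falls back to the first device when no estimate is available."""
    estimates = {d: model.estimate(operator, d, input_size) for d in devices}
    known = {d: cost for d, cost in estimates.items() if cost is not None}
    if not known:
        return devices[0]
    return min(known, key=known.get)


# Illustrative (made-up) measurements: the GPU pays a fixed overhead,
# standing in for data transfer, but scales better with input size.
model = LearnedCostModel()
for n in (1_000, 10_000, 50_000):
    model.observe("sort", "cpu", n, 0.001 * n)
    model.observe("sort", "gpu", n, 5.0 + 0.0001 * n)

small_choice = choose_device(model, "sort", 1_000)
large_choice = choose_device(model, "sort", 100_000)
```

With these made-up measurements, the fixed GPU overhead dominates for small inputs and the better GPU scaling dominates for large ones, so the decision flips at a break-even input size. Because the model is learned from observations rather than derived from algorithm internals, the same selection logic could in principle be reused across systems, which is the kind of portability the abstract aims for.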
Index Terms
- Why it is time for a HyPE: a hybrid query processing engine for efficient GPU coprocessing in DBMS