Why it is time for a HyPE: a hybrid query processing engine for efficient GPU coprocessing in DBMS

Published: 01 August 2013

Abstract

GPU acceleration is a promising approach to speed up query processing in database systems by using low-cost graphics processors as coprocessors. Two major trends have emerged in this area: (1) the development of frameworks for scheduling tasks on heterogeneous CPU/GPU platforms, which mainly targets coprocessing for applications and does not consider the specifics of database query processing and optimization; and (2) the acceleration of database operations using efficient GPU algorithms, which typically cannot be applied easily to other database systems because of their analytical, algorithm-specific cost models. One major challenge is how to combine traditional database query processing with GPU coprocessing techniques and efficient database operation scheduling in a GPU-aware query optimizer. In this thesis, we develop a hybrid query processing engine that extends the traditional physical optimization process to generate hybrid query plans and to perform cost-based optimization in a way that combines the advantages of CPUs and GPUs. Furthermore, we aim at a solution that is portable across different GPU-accelerated database management systems to maximize applicability. Preliminary results indicate great potential.


Published in

Proceedings of the VLDB Endowment, Volume 6, Issue 12
August 2013
264 pages

Publisher

VLDB Endowment
