DOI: 10.1145/2771937.2771939
research-article

Energy-Efficient Query Processing on Embedded CPU-GPU Architectures

Published: 31 May 2015

ABSTRACT

Energy efficiency is a major design and optimization factor for query co-processing of databases in embedded devices. Recently, GPUs of new-generation embedded devices have evolved with the programmability and computational capability for general-purpose applications. Such CPU-GPU architectures offer us opportunities to revisit GPU query co-processing in embedded environments for energy efficiency. In this paper, we experimentally evaluate and analyze the performance and energy consumption of a GPU query co-processor on such hybrid embedded architectures. Specifically, we study four major database operators as micro-benchmarks and evaluate TPC-H queries on CARMA, which has a quad-core ARM Cortex-A9 CPU and an NVIDIA Quadro 1000M GPU. We observe that the CPU delivers both better performance and lower energy consumption than the GPU for simple operators such as selection and aggregation. However, the GPU outperforms the CPU for sort and hash join in terms of both performance and energy consumption. We further show that CPU-GPU query co-processing can be an effective means of energy-efficient query co-processing in embedded systems with proper tuning and optimizations.


            • Published in

              DaMoN'15: Proceedings of the 11th International Workshop on Data Management on New Hardware
              May 2015
              100 pages
              ISBN: 9781450336383
              DOI: 10.1145/2771937

              Copyright © 2015 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States



              Qualifiers

              • research-article
              • Research
              • Refereed limited

              Acceptance Rates

              DaMoN'15 Paper Acceptance Rate: 12 of 16 submissions, 75%
              Overall Acceptance Rate: 80 of 102 submissions, 78%
