ABSTRACT
GPUs are now widely used to accelerate high-performance computing applications, and energy conservation in such systems has become an important research topic. Dynamic voltage/frequency scaling (DVFS) has proven to be an appealing method for saving energy in traditional computing centers. However, there is still a lack of firsthand studies on the effectiveness of GPU DVFS. This paper presents a thorough measurement study that explores how GPU DVFS affects system energy consumption. We conduct experiments on a real GPU platform with 37 benchmark applications. Our results show that GPU voltage/frequency scaling is an effective approach to conserving energy. For example, by scaling down the GPU core voltage and frequency, we achieve an average energy reduction of 19.28% compared with the default setting, while giving up no more than 4% of performance. For all tested GPU applications, core voltage scaling is significantly effective in reducing system energy consumption, whereas the effects of scaling core frequency and memory frequency depend on the characteristics of the GPU application.
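The trade-off quoted above can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' method: it assumes energy is average power multiplied by execution time, and it reads the 4% performance loss as a 4% longer runtime. The function names are hypothetical, and because 19.28% is an average over many applications, the result is only illustrative.

```python
def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed, assuming E = average power * execution time."""
    return avg_power_watts * runtime_seconds


def implied_power_ratio(energy_reduction: float, slowdown: float) -> float:
    """Scaled-to-default average power ratio implied by E = P * t.

    energy_reduction: fractional energy saving (0.1928 means 19.28%).
    slowdown: fractional increase in runtime (0.04 means 4% longer),
              which is an assumed reading of "4% of performance".
    """
    energy_ratio = 1.0 - energy_reduction  # E_scaled / E_default
    time_ratio = 1.0 + slowdown            # t_scaled / t_default
    return energy_ratio / time_ratio       # P_scaled / P_default


# Figures taken from the abstract; everything else is an assumption.
power_ratio = implied_power_ratio(energy_reduction=0.1928, slowdown=0.04)
print(f"implied average power ratio: {power_ratio:.4f}")
```

Under these assumptions, average power would have to fall by roughly 22% for a 4% longer run to still use 19.28% less energy, which shows why the power saving must outweigh the added runtime.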