The Journal of Supercomputing

Published: 01-04-2016

Joint frequency scaling of processor and DRAM

Authors: Vaibhav Sundriyal, Masha Sosonkina

Published in: The Journal of Supercomputing | Issue 4/2016

Abstract

Energy efficiency and energy-proportional computing have become a central focus in modern supercomputers. Many previous energy-saving strategies have focused solely on the CPU, while the DRAM subsystem has not been addressed sufficiently, even though memory consumes about 20% of the total power in a typical server platform. This paper describes a novel runtime system that scales the frequency of both the processor and the DRAM based on performance and power models that are also proposed here. Specifically, a performance-loss constraint is first chosen for an application; then an optimal processor–DRAM frequency pair is determined from the models such that the pair minimizes the energy consumption in a given timeslice. Experiments performed on the SPEC CPU™ 2006, NAS NPB, and pARMS benchmarks demonstrate that the proposed runtime system may obtain total energy savings for both memory- and compute-intensive applications. In particular, as much as 22% of energy was saved with a low performance loss of about 4.8%.
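The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' runtime system: the performance model, the power model, their coefficients, and the frequency lists below are all hypothetical placeholders, since the abstract does not give the actual models. The sketch only shows the general idea of exhaustively searching the processor–DRAM frequency pairs, discarding those that violate a performance-loss constraint, and keeping the pair with the lowest predicted energy for one timeslice.

```python
from itertools import product


def predicted_time(f_cpu, f_mem, t_cpu_ref, t_mem_ref, f_cpu_max, f_mem_max):
    """Hypothetical performance model (not from the paper).

    t_cpu_ref and t_mem_ref are the compute-bound and memory-bound parts of
    the timeslice measured at the highest frequencies; each part is assumed
    to stretch inversely with its own frequency.
    """
    return t_cpu_ref * f_cpu_max / f_cpu + t_mem_ref * f_mem_max / f_mem


def predicted_power(f_cpu, f_mem, p_static=40.0, k_cpu=6.0, k_mem=5.0):
    """Hypothetical power model in watts (coefficients are made up)."""
    return p_static + k_cpu * f_cpu ** 3 + k_mem * f_mem


def select_pair(cpu_freqs, mem_freqs, t_cpu_ref, t_mem_ref, max_loss):
    """Return (f_cpu, f_mem, energy) minimizing predicted energy.

    Only pairs whose predicted time stays within (1 + max_loss) times the
    predicted time at the highest frequencies are considered.
    """
    f_cpu_max, f_mem_max = max(cpu_freqs), max(mem_freqs)
    t_base = predicted_time(f_cpu_max, f_mem_max,
                            t_cpu_ref, t_mem_ref, f_cpu_max, f_mem_max)
    best = None
    for f_cpu, f_mem in product(cpu_freqs, mem_freqs):
        t = predicted_time(f_cpu, f_mem,
                           t_cpu_ref, t_mem_ref, f_cpu_max, f_mem_max)
        if t > (1.0 + max_loss) * t_base:
            continue  # violates the performance-loss constraint
        energy = predicted_power(f_cpu, f_mem) * t  # joules = watts * seconds
        if best is None or energy < best[2]:
            best = (f_cpu, f_mem, energy)
    return best


if __name__ == "__main__":
    # Frequencies in GHz; the values are illustrative only.
    cpu_freqs = [1.2, 1.8, 2.4, 3.0]
    mem_freqs = [0.8, 1.066, 1.333]
    # A timeslice that is 60% compute-bound, with a 5% allowed slowdown.
    print(select_pair(cpu_freqs, mem_freqs, 0.6, 0.4, 0.05))
```

The highest-frequency pair always satisfies the constraint, so the search never comes back empty. With a handful of discrete frequencies per component, an exhaustive search of this kind is cheap enough to repeat every timeslice.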

TOP500 list: http://top500.org/.

The authors’ previous work [31] outlines the pitfalls of models that rely on a user-defined performance-loss tolerance and introduces a model based on instantaneous power consumption.

LMBench web-site: http://www.bitmover.com/lmbench/.

See, e.g., http://www.anandtech.com/show/6355/intels-haswell-architecture/8.

Wattsup meter: https://www.wattsupmeters.com.

SPEC CPU™ 2006 benchmarks web-site: https://www.spec.org/cpu2006/.

1. Begum R, Werner D, Hempstead M, Prasad G, Challen G (2015) Energy-performance trade-offs on energy-constrained devices with multi-component DVFS. In: Workload Characterization (IISWC), 2015 IEEE International Symposium on, pp 34–43, Oct 2015

2. Borkar S (2001) The exascale challenge, 2011. Keynote speech. In: the 12th International Conference on Parallel Architectures and Compilation Techniques

3. Chen YJ, Yang CL, Lin PS, Lu YC (2015) Thermal/performance characterization of CMPs with 3D-stacked DRAMs under synergistic voltage-frequency control of cores and DRAMs. In: Proceedings of the 2015 Conference on Research in Adaptive and Convergent Systems, RACS, pp 430–436, New York, NY, USA, 2015. ACM

4. David H, Fallin C, Gorbatov E, Hanebutte UR, Mutlu O (2011) Memory power management via dynamic voltage/frequency scaling. In: Proceedings of the 8th ACM International Conference on Autonomic Computing, pp 31–40

5. Deng Q, Meisner D, Bhattacharjee A, Wenisch TF, Bianchini R (2012) CoScale: coordinating CPU and memory system DVFS in server systems. In: Microarchitecture (MICRO), 2012 45th Annual IEEE/ACM International Symposium on, pp 143–154, Dec 2012

6. Etinski M, Corbalan J, Labarta J, Valero M, Veidenbaum A (2009) Power-aware load balancing of large scale MPI applications. In: Parallel Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pp 1–8, May 2009

7. Freeh VW, Lowenthal DK (2005) Using multiple energy gears in MPI programs on a power-scalable cluster. In: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, pp 164–173

8. Ge R, Feng X, Feng W, Cameron KW (2007) CPU MISER: A performance-directed, run-time system for power-aware clusters. In: Parallel Processing, 2007. ICPP 2007. International Conference on, pp 18, Sep. 2007

9. Ge R, Feng X, Song S, Chang HC, Li D, Cameron KW (2010) PowerPack: energy profiling and analysis of high-performance systems and applications. Parallel Distrib Syst IEEE Trans 21:658–671

10. Gonzales R, Horowitz M (1995) Energy dissipation in general purpose processors. IEEE J Solid State Circuits 31:1277–1284

11. Hackenberg D, Schone R, Ilsche T, Molka D, Schuchart J, Geyer R (2015) An energy efficiency feature survey of the Intel Haswell processor. In: Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International, pp 896–904, May 2015

12. Hennessy JL, Patterson DA (2011) Computer architecture: a quantitative approach (appendix B), 5th edn. Morgan Kaufmann Publishers Inc., San Francisco

13. Henning JL (2006) SPEC CPU2006 benchmark descriptions. SIGARCH Comput Archit News 34(4):1–17

14. Hsu CH, Feng W (2005) A power-aware run-time system for high-performance computing. In: Supercomputing, Proceedings of the ACM/IEEE SC 2005 Conference, pp 1, Nov. 2005

15. Huang S, Feng W (2009) Energy-efficient cluster computing via accurate workload characterization. In: Cluster Computing and the Grid, 2009. CCGRID’09. 9th IEEE/ACM International Symposium on, pp 68–75, May 2009

16. Iancu C, Hofmeyr S, Blagojevic F, Zheng Y (2010) Oversubscription on multicore processors. In: Parallel Distributed Processing (IPDPS), 2010 IEEE International Symposium on, pp 1–11

17. Intel 64 and IA-32 architectures software developer’s manual combined volumes 3A, 3B, and 3C: System programming guide. http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

18. Ioannou N, Kauschke M, Gries M, Cintra M (2011) Phase-based application-driven hierarchical power management on the single-chip cloud computer. In: Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on, pp 131–142, Oct. 2011

19. Kandalla K, Mancini EP, Sur S, Panda DK (2010) Designing power-aware collective communication algorithms for InfiniBand clusters. In: Parallel Processing (ICPP), 2010 39th International Conference on, pp 218–227

20. Lefurgy C, Rajamani K, Rawson F, Felter W, Kistler M, Keller TW (2003) Energy management for commercial servers. Computer 36(12):39–48

21. Li Z, Saad Y, Sosonkina M (2003) pARMS: a parallel version of the algebraic recursive multilevel solver. Numer Linear Algebra Appl 10:485–509

22. Lim MY, Freeh VW, Lowenthal DK (2006) Adaptive, transparent frequency and voltage scaling of communication phases in MPI programs. In: Proceedings of the 2006 ACM/IEEE conference on Supercomputing

23. Mills N, Mills E (2015) Taming the energy use of gaming computers. Energy Efficiency 1–18. doi:10.1007/s12053-015-9371-1

24. Mittal S (2014) A survey of techniques for improving energy efficiency in embedded computing systems. Int J Comput Aided Eng Technol (IJACET) 6:440–459

25. Moscibroda T, Mutlu O (2007) Memory performance attacks: Denial of memory service in multi-core systems. In: Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, SS’07, pp 18:1–18:18, Berkeley, CA, USA, 2007. USENIX Association

26. Park J, Shin D, Chang N, Pedram M (2010) Accurate modeling and calculation of delay and energy overheads of dynamic voltage scaling in modern high-performance microprocessors. In: 2010 International Symposium on Low-Power Electronics and Design (ISLPED), pp 419–424

27. Rountree B, Lowenthal DK, de Supinski BR, Schulz M, Freeh VW, Bletsch T (2009) Adagio: making DVS practical for complex HPC applications. In: Proceedings of the 23rd international conference on Supercomputing, ICS’09, pp 460–469, New York, NY, USA, 2009. ACM

28. Saad Y (2003) Iterative methods for sparse linear systems, 2nd edn. SIAM, Philadelphia

29. Sosonkina M, Saad Y, Cai X (2004) Using the parallel algebraic recursive multilevel solver in modern physical applications. Future Gener Comput Syst 20:489–500

30. Sundriyal V, Sosonkina M (2011) Per-call energy saving strategies in all-to-all communications. In: Proceedings of the 18th European MPI Users’ Group conference on Recent advances in the message passing interface, EuroMPI’11, pp 188–197, Berlin, Heidelberg, 2011. Springer-Verlag

31. Sundriyal V, Sosonkina M (2013) Initial investigation of a scheme to use instantaneous CPU power consumption for energy savings format. In: Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, E2SC ’13, pp 1:1–1:6, New York, NY, USA, 2013. ACM

32. Sundriyal V, Sosonkina M, Gaenko A (2012) Runtime procedure for energy savings in applications with point-to-point communications. In: Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on, pp 155–162

33. Sundriyal V, Sosonkina M, Zhang Z (2012) Achieving energy efficiency during collective communications. Pract Exp Concurr Comput 25:2140–2156

34. Tiwari A, Schulz M, Arrington L (2015) Predicting optimal power allocation for CPU and DRAM domains. In: Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International, pp 951–959, May 2015

35. Vishnu A, Song S, Marquez A, Barker K, Kerbyson D, Cameron K, Balaji P (2010) Designing energy efficient communication runtime systems for data centric programming models. In: Proceedings of the 2010 IEEE/ACM Int’l Conference on Green Computing and Communications & Int’l Conference on Cyber, Physical and Social Computing, GREENCOM-CPSCOM ’10, pp 229–236, Washington, DC, USA, 2010. IEEE Computer Society

36. Zhang Z, Chang JM (2014) A cool scheduler for multi-core systems exploiting program phases. Comput IEEE Trans 63(5):1061–1073

Title: Joint frequency scaling of processor and DRAM
Authors: Vaibhav Sundriyal, Masha Sosonkina
Publication date: 01-04-2016
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 4/2016
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-016-1680-4
