The Journal of Supercomputing

2 January 2021

PEPS: predictive energy-efficient parallel scheduler for multi-core processors

Authors: Zeinab Maghsoud, Hamid Noori, Saadat Pour Mozaffari

Published in: The Journal of Supercomputing | Issue 7/2021


Abstract

In multi-core processors, energy efficiency and performance are both essential concerns. Energy-saving techniques usually cost performance, and vice versa. The energy delay product (EDP) is therefore widely used as a metric that trades off energy saving against performance. This paper presents a technique for performing work-stealing scheduling in the operating system kernel without requiring any modification to the user-space program. The proposed scheduler, PEPS, uses predictive models to determine at runtime the number of active cores and the processor clock frequency that minimize EDP for any running program. Because EDP is a long-term metric, PEPS uses instructions per watt (IPW) within each time frame to determine the best configuration. Using performance and power prediction models, PEPS finds the most energy-efficient configuration for the next time interval. Because workloads behave differently at runtime, and programs with different degrees of parallelization respond differently to a given configuration, the proposed method uses performance counters to characterize the workload. Compared to the Linux scheduler, the proposed algorithm improves energy saving by up to 25% at the cost of a 7% performance loss. Moreover, it reduces temperature by 24% and improves EDP by 19%.
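The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' kernel implementation: the `Config` type, the `pick_config` helper, and the two toy prediction functions are hypothetical stand-ins, assumed here only to show how EDP and IPW are computed and how a per-interval search over (cores, frequency) pairs might look.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable


@dataclass(frozen=True)
class Config:
    """One candidate operating point: active cores and clock frequency."""
    cores: int
    freq_ghz: float


def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy delay product of a whole run; lower is better."""
    return energy_joules * delay_seconds


def ipw(instructions: float, watts: float) -> float:
    """Instructions per watt for one time frame; higher is better."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return instructions / watts


def pick_config(
    configs: Iterable[Config],
    predict_instructions: Callable[[Config], float],
    predict_watts: Callable[[Config], float],
) -> Config:
    """Return the candidate with the highest predicted IPW for the next interval."""
    return max(
        configs,
        key=lambda c: ipw(predict_instructions(c), predict_watts(c)),
    )


# Toy prediction models (assumptions for illustration, not the paper's models):
# throughput grows sublinearly with cores, power grows cubically with frequency.
def toy_instructions(c: Config) -> float:
    return 1e9 * c.freq_ghz * c.cores ** 0.5


def toy_watts(c: Config) -> float:
    return 5.0 + c.cores * c.freq_ghz ** 3


candidates = [Config(n, f) for n, f in product([1, 2, 4], [1.0, 2.0])]
best = pick_config(candidates, toy_instructions, toy_watts)
```

With these toy models the search favors many cores at a low clock, because the assumed power model penalizes frequency far more than core count. In a real scheduler the two prediction functions would be driven by performance-counter readings, as the abstract describes.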



Title: PEPS: predictive energy-efficient parallel scheduler for multi-core processors
Authors: Zeinab Maghsoud, Hamid Noori, Saadat Pour Mozaffari
Publication date: 2 January 2021
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 7/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-020-03562-x
