Published in: The Journal of Supercomputing 8/2017

8 February 2016

SLA-aware energy-efficient scheduling scheme for Hadoop YARN

Written by: Xiaojun Cai, Feng Li, Ping Li, Lei Ju, Zhiping Jia


Abstract

Apache Hadoop has become ubiquitous in cloud computing, which provides resources as services for multi-tenant applications. YARN (a.k.a. MapReduce 2.0) is one of the key features of second-generation Hadoop; it provides resource management and scheduling for large-scale MapReduce environments. Two major challenges for the YARN scheduler are to automatically tailor and control resource allocations to different jobs so that they meet their Service Level Agreements (SLAs), and to minimize the energy consumption of the overall cloud computing system. In this work, we propose an SLA-aware energy-efficient scheduling scheme that allocates an appropriate amount of resources to MapReduce applications on the YARN architecture. Our task scheduling policy takes data locality information into account to reduce MapReduce network traffic. Furthermore, the slack time between the actual execution time of completed tasks and the expected completion time of the application is used to improve the energy efficiency of the system. An online userspace governor-based dynamic voltage and frequency scaling (DVFS) scheme is designed in the YARN per-application ApplicationMaster to dynamically change the CPU frequency for upcoming tasks given the slack time from previously completed tasks. Experimental evaluation shows that the proposed scheme outperforms existing MapReduce scheduling policies in terms of both resource utilization and energy efficiency.
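To illustrate the slack-reclamation idea described in the abstract, the following is a minimal sketch, not the authors' algorithm. It assumes that task execution time scales inversely with CPU frequency (a simplification that holds best for CPU-bound tasks), and the function name `pick_frequency` and its parameters are hypothetical. Given the estimated remaining work at full speed and the time left until the SLA deadline, it returns the lowest available frequency that still meets the deadline.

```python
from typing import Sequence


def pick_frequency(
    remaining_work_s: float,
    time_to_deadline_s: float,
    available_freqs_khz: Sequence[int],
) -> int:
    """Pick the lowest CPU frequency that still meets the deadline.

    remaining_work_s: estimated run time of the upcoming tasks at the
        highest available frequency (hypothetical input, e.g. derived
        from the durations of already completed tasks).
    time_to_deadline_s: time left until the application's SLA deadline.
    available_freqs_khz: discrete frequency steps offered by the CPU.

    Assumption: run time is inversely proportional to frequency.
    """
    if not available_freqs_khz:
        raise ValueError("need at least one available frequency")

    f_max = max(available_freqs_khz)

    # No slack left (or deadline already passed): run at full speed.
    if time_to_deadline_s <= 0 or remaining_work_s >= time_to_deadline_s:
        return f_max

    # Slowest frequency at which the remaining work fits into the
    # remaining time, under the inverse-scaling assumption.
    required_khz = f_max * remaining_work_s / time_to_deadline_s

    # Round up to the next discrete step the hardware supports.
    # f_max always qualifies here, so the list is never empty.
    candidates = [f for f in available_freqs_khz if f >= required_khz]
    return min(candidates)


if __name__ == "__main__":
    steps = [1_200_000, 1_600_000, 2_000_000, 2_400_000]
    # 60 s of work at full speed, 100 s until the deadline.
    print(pick_frequency(60.0, 100.0, steps))
```

On Linux, a value chosen this way could be applied through the cpufreq `userspace` governor by writing it to `/sys/devices/system/cpu/cpuN/cpufreq/scaling_setspeed`; the sketch deliberately leaves that step out because it requires root privileges and hardware support.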


Metadata
Title
SLA-aware energy-efficient scheduling scheme for Hadoop YARN
Written by
Xiaojun Cai
Feng Li
Ping Li
Lei Ju
Zhiping Jia
Publication date
8 February 2016
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 8/2017
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-016-1653-7
