Published in: Cluster Computing 5/2023

14-10-2022

Reinforcement learning based energy efficient resource allocation strategy of MapReduce jobs with deadline constraint

Author: Greeshma Lingam

Abstract

Big Data applications consume considerable energy when processing massive volumes of data in a heterogeneous environment, and reducing that consumption is an important research topic. Conserving energy while meeting a deadline constraint in such an environment is particularly challenging. In this paper, we formulate the scheduling of MapReduce jobs as a minimization problem whose decision variables are subject to a user-specified deadline constraint. We further propose a Learning Automata-based MapReduce Scheduling (LA-MRS) algorithm that determines the resource allocation and reduces the energy consumption of MapReduce tasks in a heterogeneous environment. We evaluate LA-MRS on HiBench benchmark workloads such as Enhanced DFSIO, Nutch Indexing, k-means Clustering and Hive Join. The experiments show that LA-MRS schedules MapReduce tasks with around 25% less energy consumption than the existing algorithms.
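
The abstract does not spell out how LA-MRS works internally, so the following is only a minimal sketch of the general idea behind a learning automaton applied to deadline-constrained, energy-aware node selection. It is not the paper's algorithm. The node model (a fixed runtime and power draw per node), the function name `schedule_with_automaton`, the reward rule, and all parameter values are assumptions made for illustration. The update used is the textbook linear reward-inaction scheme: a rewarded choice gains probability, an unrewarded choice changes nothing.

```python
import random


def schedule_with_automaton(nodes, deadline, steps=500, rate=0.05, seed=0):
    """Learn node-selection probabilities with a linear reward-inaction automaton.

    nodes    -- list of (runtime_s, power_w) pairs, one per candidate node
                (a hypothetical, fixed-cost model of a heterogeneous cluster)
    deadline -- user-specified deadline in seconds
    Returns the final list of selection probabilities, one per node.
    """
    rng = random.Random(seed)
    n = len(nodes)
    probs = [1.0 / n] * n          # start with no preference among nodes
    best_energy = float("inf")     # lowest feasible energy observed so far

    for _ in range(steps):
        # Pick a node according to the current probability vector.
        choice = rng.choices(range(n), weights=probs)[0]
        runtime, power = nodes[choice]
        energy = runtime * power   # joules, under the fixed-cost assumption

        # Assumed reward rule: the deadline is met and the energy is no worse
        # than the best feasible energy seen so far.
        rewarded = runtime <= deadline and energy <= best_energy
        if not rewarded:
            continue               # "inaction": probabilities stay unchanged

        best_energy = energy
        # Reward: move probability mass toward the chosen node.
        probs = [
            p + rate * (1.0 - p) if i == choice else p * (1.0 - rate)
            for i, p in enumerate(probs)
        ]
    return probs


if __name__ == "__main__":
    # Three made-up nodes: fast but power-hungry, balanced, and slow but frugal.
    cluster = [(40.0, 100.0), (60.0, 50.0), (200.0, 10.0)]
    print(schedule_with_automaton(cluster, deadline=100.0))
```

In this toy setup the third node uses the least energy but misses the 100-second deadline, so it is never rewarded and its selection probability can only shrink. The sketch makes no convergence claim: a reward-inaction automaton can lock onto a feasible but suboptimal node, which is one reason practical schedulers tune the learning rate or use richer reward signals.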


Metadata
Title
Reinforcement learning based energy efficient resource allocation strategy of MapReduce jobs with deadline constraint
Author
Greeshma Lingam
Publication date
14-10-2022
Publisher
Springer US
Published in
Cluster Computing / Issue 5/2023
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-022-03761-6
