Published in: Cluster Computing 3/2019

10-03-2018

Joint deadline-constrained and influence-aware design for allocating MapReduce jobs in cloud computing systems

Authors: Jenn-Wei Lin, Joseph M. Arul, Chi-Yi Lin

Published in: Cluster Computing | Special Issue 3/2019


Abstract

MapReduce can speed up the execution of jobs that operate over big data. A MapReduce job is divided into a number of map and reduce tasks according to a well-defined partitioning of the data it processes. In a cloud computing system, multiple MapReduce jobs may be submitted together and compete for the system's computing resources. When a job has a particular performance requirement (e.g., an execution deadline), appropriate computing resources must be reserved for executing its map and reduce tasks; otherwise, the requirement cannot be satisfied. Several deadline-constrained MapReduce schedulers have been proposed, but most of them are not aware of the performance influence that newly allocated tasks have on existing tasks. We propose a deadline-constrained and influence-aware MapReduce scheduler that combines three factors: (1) relaxed data locality, (2) performance influence on existing tasks, and (3) coordination of allocation contention. We first adopt the data-locality criterion to make a tentative allocation plan. We then verify this plan: if some new tasks severely affect existing tasks, or the deadline requirements of some new tasks are not satisfied, the plan is modified by re-allocating some of the new tasks. To optimize computing resource usage, the modification is carried out by solving a well-known network flow problem, minimum-cost maximum-flow (MCMF). A heuristic algorithm is also presented to reduce the complexity of solving the MCMF problem. As a result, the final allocation plan not only meets the deadline requirements of new jobs but also takes into account the performance influence on existing jobs. Finally, we conduct a performance analysis to evaluate the proposed MapReduce scheduler using various performance metrics.
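The abstract does not spell out how the flow network is built, so the following is only a minimal sketch of how a task re-allocation step could be phrased as an MCMF problem, not the authors' formulation. It assumes one unit of flow per new task, node capacities equal to free slots, and a hypothetical per-pair cost (for example 0 for a data-local node, larger values for remote nodes or for nodes where existing tasks would be disturbed). Pairs that would violate a deadline are simply left out of the cost table. The names `MinCostMaxFlow`, `allocate`, `free_slots`, and `cost` are illustrative only.

```python
from collections import deque


class MinCostMaxFlow:
    """Successive shortest paths, using a queue-based Bellman-Ford (SPFA)."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # per-vertex lists of edge indices
        self.to, self.cap, self.cost = [], [], []

    def add_edge(self, u, v, cap, cost):
        # Forward edge gets an even index; its residual edge is index ^ 1.
        for a, b, c, w in ((u, v, cap, cost), (v, u, 0, -cost)):
            self.graph[a].append(len(self.to))
            self.to.append(b)
            self.cap.append(c)
            self.cost.append(w)

    def solve(self, s, t):
        flow = total_cost = 0
        while True:
            dist = [float("inf")] * self.n
            prev_edge = [-1] * self.n
            in_queue = [False] * self.n
            dist[s] = 0
            queue = deque([s])
            while queue:  # cheapest augmenting path in the residual graph
                u = queue.popleft()
                in_queue[u] = False
                for e in self.graph[u]:
                    v = self.to[e]
                    if self.cap[e] > 0 and dist[u] + self.cost[e] < dist[v]:
                        dist[v] = dist[u] + self.cost[e]
                        prev_edge[v] = e
                        if not in_queue[v]:
                            in_queue[v] = True
                            queue.append(v)
            if prev_edge[t] == -1:  # sink unreachable: flow is maximum
                return flow, total_cost
            push, v = float("inf"), t
            while v != s:  # bottleneck capacity along the path
                push = min(push, self.cap[prev_edge[v]])
                v = self.to[prev_edge[v] ^ 1]
            v = t
            while v != s:  # augment along the path
                e = prev_edge[v]
                self.cap[e] -= push
                self.cap[e ^ 1] += push
                v = self.to[e ^ 1]
            flow += push
            total_cost += push * dist[t]


def allocate(tasks, free_slots, cost):
    """Assign each task to a node; (task, node) pairs absent from `cost` are infeasible."""
    nodes = list(free_slots)
    source, sink = 0, 1
    task_id = {task: 2 + i for i, task in enumerate(tasks)}
    node_id = {node: 2 + len(tasks) + j for j, node in enumerate(nodes)}
    net = MinCostMaxFlow(2 + len(tasks) + len(nodes))
    for task in tasks:
        net.add_edge(source, task_id[task], 1, 0)  # each task is placed at most once
    pair_edge = {}
    for (task, node), c in cost.items():
        pair_edge[(task, node)] = len(net.to)  # index of the forward edge
        net.add_edge(task_id[task], node_id[node], 1, c)
    for node in nodes:
        net.add_edge(node_id[node], sink, free_slots[node], 0)  # slot capacity
    _, total = net.solve(source, sink)
    # A saturated task->node edge (remaining capacity 0) means the pair was chosen.
    plan = {task: node for (task, node), e in pair_edge.items() if net.cap[e] == 0}
    return plan, total


# Hypothetical instance: node A has 1 free slot, node B has 2.
tasks = ["t1", "t2", "t3"]
free_slots = {"A": 1, "B": 2}
cost = {("t1", "A"): 0, ("t1", "B"): 3, ("t2", "A"): 0, ("t2", "B"): 2, ("t3", "B"): 1}
plan, total = allocate(tasks, free_slots, cost)
print(plan, total)  # {'t1': 'A', 't2': 'B', 't3': 'B'} 3
```

In this toy instance both `t1` and `t2` prefer node A, which has a single free slot. The flow solution resolves that contention by moving `t2`, whose remote cost is lower, to node B. This mirrors in miniature the idea of modifying a data-locality plan while keeping the overall cost low.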


Metadata
Title
Joint deadline-constrained and influence-aware design for allocating MapReduce jobs in cloud computing systems
Authors
Jenn-Wei Lin
Joseph M. Arul
Chi-Yi Lin
Publication date
10-03-2018
Publisher
Springer US
Published in
Cluster Computing / Special Issue 3/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-018-1981-x
