Published in: Cluster Computing 3/2015

01-09-2015

MOMTH: multi-objective scheduling algorithm of many tasks in Hadoop

Authors: Mihaela-Catalina Nita, Florin Pop, Cristiana Voicu, Ciprian Dobre, Fatos Xhafa

Abstract

Business solutions today face a real challenge in the context of the large amounts of data generated by complex software applications: using limited resources efficiently to accomplish specific operations and tasks. Depending on the type of application, when a certain service must be delivered within a specific time and on a limited budget, a sequential application may be redesigned so that it becomes scalable and able to run on multiple resources. The many-task computing model brings together loosely coupled applications, composed of many dependent or independent tasks, that work together toward a common result. When requesting a service, the constraints users specify most frequently are deadline and budget. This paper elaborates on a multi-objective scheduling algorithm of many tasks in Hadoop for big data processing, named MOMTH. We consider objective functions related to users and resources together with constraints such as deadline (scheduling in due time) and budget. The algorithm was evaluated in the scheduling load simulator, a tool integrated in Hadoop. MobiWay, a collaboration platform that exposes interoperability between a large number of sensing mobile devices and a wide range of mobility applications, was chosen for the performance analysis of MOMTH. We compared the proposed algorithm with the first-in-first-out (FIFO) and fair schedulers and obtained similar performance for our approach.
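
The abstract does not state MOMTH's objective functions, so the following is only an illustrative sketch of what scheduling under deadline and budget constraints can look like, not the authors' algorithm. It assumes a hypothetical model: each task has a work amount, a positive deadline, and a positive budget; each resource has a speed and a price per time unit; feasible placements are ranked by a weighted sum of normalized finish time and normalized cost. All names (`Task`, `Resource`, `schedule`, `w_time`, `w_cost`) are invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Task:
    name: str
    work: float      # abstract work units (hypothetical model)
    deadline: float  # latest acceptable finish time, assumed > 0
    budget: float    # maximum cost the user accepts, assumed > 0


@dataclass
class Resource:
    name: str
    speed: float          # work units processed per time unit
    cost_per_time: float  # price charged per time unit
    busy_until: float = 0.0


def schedule(tasks: List[Task], resources: List[Resource],
             w_time: float = 0.5, w_cost: float = 0.5
             ) -> List[Tuple[str, Optional[str]]]:
    """Greedy assignment: earliest deadline first, lowest weighted score wins.

    Returns (task name, resource name) pairs; the resource is None when no
    placement satisfies both the deadline and the budget.
    """
    plan: List[Tuple[str, Optional[str]]] = []
    for task in sorted(tasks, key=lambda t: t.deadline):
        best = None  # (score, resource, finish time)
        for res in resources:
            run_time = task.work / res.speed
            finish = res.busy_until + run_time
            cost = run_time * res.cost_per_time
            # Hard constraints: skip placements that miss deadline or budget.
            if finish > task.deadline or cost > task.budget:
                continue
            # Two objectives folded into one weighted, normalized score.
            score = (w_time * finish / task.deadline
                     + w_cost * cost / task.budget)
            if best is None or score < best[0]:
                best = (score, res, finish)
        if best is None:
            plan.append((task.name, None))
        else:
            _, res, finish = best
            res.busy_until = finish  # the resource is occupied until then
            plan.append((task.name, res.name))
    return plan
```

A weighted sum is the simplest way to combine two objectives; it is used here only for brevity. A FIFO baseline would correspond to dropping the sort and the scoring and taking the first free resource.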

Footnotes
1
Thanks to The scheduling zoo: A searchable bibliography on scheduling by Peter Brucker and Sigrid Knust, http://www-desir.lip6.fr/~durrc/query/.
 
Metadata
Title
MOMTH: multi-objective scheduling algorithm of many tasks in Hadoop
Authors
Mihaela-Catalina Nita
Florin Pop
Cristiana Voicu
Ciprian Dobre
Fatos Xhafa
Publication date
01-09-2015
Publisher
Springer US
Published in
Cluster Computing / Issue 3/2015
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI
https://doi.org/10.1007/s10586-015-0454-8
