
2018 | OriginalPaper | Chapter

A Vector-Scheduling Approach for Running Many-Task Applications in the Cloud

Authors: Brian Peterson, Yalda Fazlalizadeh, Gerald Baumgartner, Qingyang Wang

Published in: Cloud Computing – CLOUD 2018

Publisher: Springer International Publishing

Abstract

The performance variation of cloud resources makes it difficult to run certain scientific applications in the cloud because of their unique synchronization and communication requirements. We propose a decentralized scheduling approach for many-task applications that assigns individual tasks to cloud nodes based on periodic performance measurements of the cloud resources. In this paper, we present a vector-based scheduling algorithm that assigns tasks to nodes based on measuring the compute performance and the queue length of those nodes. Our experiments with a set of tasks in CloudLab show that the application proceeds in three distinct phases: flooding the cloud nodes with tasks, a steady state in which all nodes are busy, and the end game in which the remaining tasks are executed on the fastest nodes. We present heuristics for these three phases and demonstrate with measurements in CloudLab that they result in a reduction of the overall execution time of the many-task application.
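The abstract does not give the details of the scheduling algorithm, so the following is only a minimal sketch of the general idea it describes: choosing a node for each task from two per-node measurements, compute performance and queue length. The names `Node`, `estimated_finish`, `pick_node`, and `assign`, the unit of `speed`, and the scoring rule itself (queued tasks plus one, divided by measured speed) are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    name: str
    speed: float       # measured compute performance, in tasks per second (assumed unit)
    queue_length: int  # number of tasks already waiting on this node


def estimated_finish(node: Node) -> float:
    """Estimated time until a newly assigned task would finish on this node.

    Hypothetical scoring rule: the new task waits behind the queued tasks,
    and every task takes 1 / speed seconds.
    """
    return (node.queue_length + 1) / node.speed


def pick_node(nodes: List[Node]) -> Node:
    """Return the node with the smallest estimated finish time."""
    return min(nodes, key=estimated_finish)


def assign(tasks: List[str], nodes: List[Node]) -> Dict[str, str]:
    """Greedily place each task and update the chosen node's queue length."""
    placement: Dict[str, str] = {}
    for task in tasks:
        node = pick_node(nodes)
        node.queue_length += 1
        placement[task] = node.name
    return placement
```

With a fast and a slow node, early tasks go to the fast node until its queue grows long enough that the slow node becomes the better choice. In a real system the `speed` values would be refreshed from periodic measurements rather than fixed.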


Metadata
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-94295-7_1
