
2018 | Original Paper | Book Chapter

A Vector-Scheduling Approach for Running Many-Task Applications in the Cloud

Authors: Brian Peterson, Yalda Fazlalizadeh, Gerald Baumgartner, Qingyang Wang

Published in: Cloud Computing – CLOUD 2018

Publisher: Springer International Publishing


Abstract

The performance variation of cloud resources makes it difficult to run certain scientific applications in the cloud because of their unique synchronization and communication requirements. We propose a decentralized scheduling approach for many-task applications that assigns individual tasks to cloud nodes based on periodic performance measurements of the cloud resources. In this paper, we present a vector-based scheduling algorithm that assigns tasks to nodes based on measuring the compute performance and the queue length of those nodes. Our experiments with a set of tasks in CloudLab show that the application proceeds in three distinct phases: flooding the cloud nodes with tasks, a steady state in which all nodes are busy, and the end game in which the remaining tasks are executed on the fastest nodes. We present heuristics for these three phases and demonstrate with measurements in CloudLab that they result in a reduction of the overall execution time of the many-task application.
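The abstract does not spell out the scheduling rule, so the following is only a rough illustration of the general idea of choosing a node from measured compute performance and queue length. It is not the authors' algorithm. It assumes each node is summarized by a hypothetical two-component vector (measured speed, queue length) and that a task goes to the node with the smallest estimated completion time, `(queue_length + 1) / speed`. The names `NodeState`, `estimated_finish`, `pick_node`, and `assign` are all invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical per-node measurement vector (an assumption, not from the paper)."""
    name: str
    speed: float       # measured throughput, e.g. tasks per second
    queue_length: int  # tasks already waiting on the node


def estimated_finish(node: NodeState) -> float:
    # Assumed cost model: time until one additional task would complete.
    return (node.queue_length + 1) / node.speed


def pick_node(nodes: list[NodeState]) -> NodeState:
    # Choose the node with the smallest estimated completion time.
    return min(nodes, key=estimated_finish)


def assign(tasks: list[str], nodes: list[NodeState]) -> dict[str, str]:
    # Greedily place each task, updating the chosen node's queue length.
    placement: dict[str, str] = {}
    for task in tasks:
        node = pick_node(nodes)
        node.queue_length += 1
        placement[task] = node.name
    return placement
```

Under this assumed cost model, a node that is 2.5 times faster than its peer receives the first two tasks before the slower node becomes the better choice for the third. This loosely resembles the flooding phase, in which fast nodes fill up first.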


Metadata
Title
A Vector-Scheduling Approach for Running Many-Task Applications in the Cloud
Authors
Brian Peterson
Yalda Fazlalizadeh
Gerald Baumgartner
Qingyang Wang
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-94295-7_1
