2015 | Original Paper | Book Chapter

Machine-Learning-Based Load Balancing for Community Ice Code Component in CESM

Authors: Prasanna Balaprakash, Yuri Alexeev, Sheri A. Mickelson, Sven Leyffer, Robert Jacob, Anthony Craig

Published in: High Performance Computing for Computational Science -- VECPAR 2014

Publisher: Springer International Publishing

Abstract

Load balancing scientific codes on massively parallel architectures is becoming an increasingly challenging task. In this paper, we focus on the Community Earth System Model, a widely used climate modeling code. It comprises six components, each of which exhibits different scalability patterns. Previously, an analytical performance model has been used to find optimal load-balancing parameter configurations for each component. For the Community Ice Code component, however, the analytical performance model is too restrictive to capture its scalability patterns. We therefore developed a machine-learning-based load-balancing algorithm. It involves fitting a surrogate model to a small number of load-balancing configurations and their corresponding runtimes. This model is then used to find high-quality parameter configurations. Compared with the current practice of expert-knowledge-based enumeration over feasible configurations, the machine-learning-based load-balancing algorithm requires six times fewer evaluations to find the optimal configuration.
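The surrogate-based search described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which surrogate model the authors use, so a simple inverse-distance-weighted nearest-neighbour regressor stands in for it; `synthetic_runtime` is a made-up cost function that replaces an actual timed model run; and the two-parameter configuration `(tasks, block)` is hypothetical rather than the real set of Community Ice Code load-balancing parameters.

```python
import random


def synthetic_runtime(config):
    """Hypothetical stand-in for running the code and measuring its runtime."""
    tasks, block = config
    # Compute time shrinks with more tasks, overhead grows with more tasks,
    # and an arbitrary penalty applies away from a block size of 16.
    return 1000.0 / tasks + 0.05 * tasks + 0.2 * abs(block - 16)


def knn_predict(train, config, k=3):
    """Assumed surrogate: inverse-distance-weighted k-nearest-neighbour regression."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(train, key=lambda item: dist(item[0], config))[:k]
    weights = [1.0 / (dist(c, config) + 1e-9) for c, _ in nearest]
    return sum(w * t for w, (_, t) in zip(weights, nearest)) / sum(weights)


def surrogate_search(candidates, evaluate, n_init=5, n_iter=10, seed=0):
    """Evaluate a few configurations, then let the surrogate choose the rest."""
    rng = random.Random(seed)
    # Step 1: measure a small random sample of feasible configurations.
    evaluated = {c: evaluate(c) for c in rng.sample(candidates, n_init)}
    for _ in range(n_iter):
        remaining = [c for c in candidates if c not in evaluated]
        if not remaining:
            break
        # Step 2: "fit" the surrogate (for k-NN this is just the stored data).
        train = list(evaluated.items())
        # Step 3: measure the unevaluated configuration predicted to be fastest.
        best_guess = min(remaining, key=lambda c: knn_predict(train, c))
        evaluated[best_guess] = evaluate(best_guess)
    best = min(evaluated, key=evaluated.get)
    return best, evaluated[best], len(evaluated)


# Hypothetical feasible set: 32 task counts x 4 block sizes = 128 configurations.
candidates = [(t, b) for t in range(16, 513, 16) for b in (4, 8, 16, 32)]
best, best_time, n_evals = surrogate_search(candidates, synthetic_runtime)
print(best, best_time, n_evals)
```

The sketch spends 15 evaluations out of 128 feasible configurations, which mirrors the idea of replacing exhaustive enumeration with a model-guided search; it makes no claim about finding the true optimum or about reproducing the sixfold reduction reported in the abstract. A more careful version would normalise the parameters before computing distances and would mix in some exploration instead of always following the surrogate's best prediction.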

Metadata

Copyright year: 2015
DOI: https://doi.org/10.1007/978-3-319-17353-5_7