Skip to main content
Top

2017 | OriginalPaper | Chapter

Evolving Cut-Off Mechanisms and Other Work-Stealing Parameters for Parallel Programs

Authors : Alcides Fonseca, Nuno Lourenço, Bruno Cabral

Published in: Applications of Evolutionary Computation

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Optimizing parallel programs is a complex task because the interference among many different parameters. Work-stealing runtimes, used to dynamically balance load among different processor cores, are no exception. This work explores the automatic configuration of the following runtime parameters: dynamic granularity control algorithms, granularity control cache, work-stealing algorithm, lazy binary splitting parameter, the maximum queue size and the unparking interval. The performance of the program is highly sensible to the granularity control algorithm, which can be a combination of other granularity algorithms. In this work, we address two search-based problems: finding a globally efficient work-stealing configuration, and finding the best configuration just for an individual program. For both problems, we propose the use of a Genetic Algorithm (GA). The genotype of the GA is able to represent combinations of up to three cut-off algorithms, as well as other work-stealing parameters.
The proposed GA has been evaluated in its ability to obtain a more efficient solution across a set of programs, in its ability to generalize the solution to a larger set of programs, and its ability to evolve single programs individually.
The GA was able to improve the performance of the set of programs in the training set, but the obtained configurations were not generalized to a larger benchmark set. However, it was able to successfully improve the performance of each program individually.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Blumofe, R.D., Joerg, C.F., Kuszmaul, B.C., Leiserson, C.E., Randall, K.H., Zhou, Y.: Cilk: an efficient multithreaded runtime system. J. Parallel Distrib. Comput. 37(1), 55–69 (1996)CrossRef Blumofe, R.D., Joerg, C.F., Kuszmaul, B.C., Leiserson, C.E., Randall, K.H., Zhou, Y.: Cilk: an efficient multithreaded runtime system. J. Parallel Distrib. Comput. 37(1), 55–69 (1996)CrossRef
2.
go back to reference Dagum, L., Menon, R.: Openmp: an industry standard api for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef Dagum, L., Menon, R.: Openmp: an industry standard api for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef
3.
go back to reference Lea, D.: A java fork/join framework. In: Proceedings of the ACM 2000 Conference on Java Grande, pp. 36–43. ACM (2000) Lea, D.: A java fork/join framework. In: Proceedings of the ACM 2000 Conference on Java Grande, pp. 36–43. ACM (2000)
4.
go back to reference Charles, P., Grothoff, C., Saraswat, V., Donawa, C., Kielstra, A., Ebcioglu, K., Von Praun, C., Sarkar, V.: X10: an object-oriented approach to non-uniform cluster computing. In: ACM Sigplan Notices, vol. 40, pp. 519–538. ACM (2005) Charles, P., Grothoff, C., Saraswat, V., Donawa, C., Kielstra, A., Ebcioglu, K., Von Praun, C., Sarkar, V.: X10: an object-oriented approach to non-uniform cluster computing. In: ACM Sigplan Notices, vol. 40, pp. 519–538. ACM (2005)
5.
go back to reference Stork, S., Naden, K., Sunshine, J., Mohr, M., Fonseca, A., Marques, P., Aldrich, J.: Æminium: a permission-based concurrent-by-default programming language approach. ACM Trans. Program. Lang. Syst. (TOPLAS) 36(1), 2 (2014)CrossRef Stork, S., Naden, K., Sunshine, J., Mohr, M., Fonseca, A., Marques, P., Aldrich, J.: Æminium: a permission-based concurrent-by-default programming language approach. ACM Trans. Program. Lang. Syst. (TOPLAS) 36(1), 2 (2014)CrossRef
6.
go back to reference Frigo, M., Leiserson, C.E., Randall, K.H.: The implementation of the cilk-5 multithreaded language. In: ACM Sigplan Notices, vol. 33, pp. 212–223. ACM (1998) Frigo, M., Leiserson, C.E., Randall, K.H.: The implementation of the cilk-5 multithreaded language. In: ACM Sigplan Notices, vol. 33, pp. 212–223. ACM (1998)
7.
go back to reference Mohr, E., Kranz, D.A., Halstead, R.H.: Lazy task creation: a technique for increasing the granularity of parallel programs. IEEE Trans. Parallel Distrib. Syst. 2(3), 264–280 (1991)CrossRef Mohr, E., Kranz, D.A., Halstead, R.H.: Lazy task creation: a technique for increasing the granularity of parallel programs. IEEE Trans. Parallel Distrib. Syst. 2(3), 264–280 (1991)CrossRef
8.
go back to reference Duran, A., Corbalán, J., Ayguadé, E.: Evaluation of OpenMP task scheduling strategies. In: Eigenmann, R., Supinski, B.R. (eds.) IWOMP 2008. LNCS, vol. 5004, pp. 100–110. Springer, Heidelberg (2008). doi:10.1007/978-3-540-79561-2_9CrossRef Duran, A., Corbalán, J., Ayguadé, E.: Evaluation of OpenMP task scheduling strategies. In: Eigenmann, R., Supinski, B.R. (eds.) IWOMP 2008. LNCS, vol. 5004, pp. 100–110. Springer, Heidelberg (2008). doi:10.​1007/​978-3-540-79561-2_​9CrossRef
9.
go back to reference Duran, A., Corbalán, J., Ayguadé, E.: An adaptive cut-off for task parallelism. In: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, p. 36. IEEE Press (2008) Duran, A., Corbalán, J., Ayguadé, E.: An adaptive cut-off for task parallelism. In: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, p. 36. IEEE Press (2008)
10.
go back to reference Fonseca, A., Cabral, B.: Evaluation of runtime cut-off approaches for parallel programs. In: VECPAR 2016 Proceedings (2016) Fonseca, A., Cabral, B.: Evaluation of runtime cut-off approaches for parallel programs. In: VECPAR 2016 Proceedings (2016)
11.
go back to reference Miller, B.L., Goldberg, D.E.: Genetic algorithms, tournament selection, and the effects of noise. Complex Syst. 9(3), 193–212 (1995)MathSciNet Miller, B.L., Goldberg, D.E.: Genetic algorithms, tournament selection, and the effects of noise. Complex Syst. 9(3), 193–212 (1995)MathSciNet
12.
go back to reference DeJong, K.: An analysis of the behavior of a class of genetic adaptive systems. Ph.D. Thesis, University of Michigan (1975) DeJong, K.: An analysis of the behavior of a class of genetic adaptive systems. Ph.D. Thesis, University of Michigan (1975)
13.
go back to reference Olivier, S.L., Prins, J.F.: Evaluating OpenMP 3.0 run time systems on unbalanced task graphs. In: Müller, M.S., Supinski, B.R., Chapman, B.M. (eds.) IWOMP 2009. LNCS, vol. 5568, pp. 63–78. Springer, Heidelberg (2009). doi:10.1007/978-3-642-02303-3_6CrossRef Olivier, S.L., Prins, J.F.: Evaluating OpenMP 3.0 run time systems on unbalanced task graphs. In: Müller, M.S., Supinski, B.R., Chapman, B.M. (eds.) IWOMP 2009. LNCS, vol. 5568, pp. 63–78. Springer, Heidelberg (2009). doi:10.​1007/​978-3-642-02303-3_​6CrossRef
14.
go back to reference Tchiboukdjian, M., Danjean, V., Gautier, T., Mentec, F., Raffin, B.: A work stealing scheduler for parallel loops on shared cache multicores. In: Guarracino, M.R., et al. (eds.) Euro-Par 2010. LNCS, vol. 6586, pp. 99–107. Springer, Heidelberg (2011). doi:10.1007/978-3-642-21878-1_13CrossRef Tchiboukdjian, M., Danjean, V., Gautier, T., Mentec, F., Raffin, B.: A work stealing scheduler for parallel loops on shared cache multicores. In: Guarracino, M.R., et al. (eds.) Euro-Par 2010. LNCS, vol. 6586, pp. 99–107. Springer, Heidelberg (2011). doi:10.​1007/​978-3-642-21878-1_​13CrossRef
15.
go back to reference Cong, G., Kodali, S., Krishnamoorthy, S., Lea, D., Saraswat, V., Wen, T.: Solving large, irregular graph problems using adaptive work-stealing. In: 2008 37th International Conference on Parallel Processing, pp. 536–545. IEEE (2008) Cong, G., Kodali, S., Krishnamoorthy, S., Lea, D., Saraswat, V., Wen, T.: Solving large, irregular graph problems using adaptive work-stealing. In: 2008 37th International Conference on Parallel Processing, pp. 536–545. IEEE (2008)
16.
go back to reference Wang, L., Cui, H., Duan, Y., Lu, F., Feng, X., Yew, P.C.: An adaptive task creation strategy for work-stealing scheduling. In: Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization, pp. 266–277. ACM (2010) Wang, L., Cui, H., Duan, Y., Lu, F., Feng, X., Yew, P.C.: An adaptive task creation strategy for work-stealing scheduling. In: Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization, pp. 266–277. ACM (2010)
17.
go back to reference Chen, S., Gibbons, P.B., Kozuch, M., Liaskovitis, V., Ailamaki, A., Blelloch, G.E., Falsafi, B., Fix, L., Hardavellas, N., Mowry, T.C., et al.: Scheduling threads for constructive cache sharing on cmps. In: Proceedings of the Nineteenth Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 105–115. ACM (2007) Chen, S., Gibbons, P.B., Kozuch, M., Liaskovitis, V., Ailamaki, A., Blelloch, G.E., Falsafi, B., Fix, L., Hardavellas, N., Mowry, T.C., et al.: Scheduling threads for constructive cache sharing on cmps. In: Proceedings of the Nineteenth Annual ACM Symposium on Parallel Algorithms and Architectures, pp. 105–115. ACM (2007)
18.
go back to reference Ahmad, I., Dhodhi, M.K.: Multiprocessor scheduling in a genetic paradigm. Parallel Comput. 22(3), 395–406 (1996)CrossRefMATH Ahmad, I., Dhodhi, M.K.: Multiprocessor scheduling in a genetic paradigm. Parallel Comput. 22(3), 395–406 (1996)CrossRefMATH
19.
go back to reference Kwok, Y.K., Ahmad, I.: Efficient scheduling of arbitrary task graphs to multiprocessors using a parallel genetic algorithm. J. Parallel Distrib. Comput. 47(1), 58–77 (1997)CrossRef Kwok, Y.K., Ahmad, I.: Efficient scheduling of arbitrary task graphs to multiprocessors using a parallel genetic algorithm. J. Parallel Distrib. Comput. 47(1), 58–77 (1997)CrossRef
20.
go back to reference Wang, L., Siegel, H.J., Roychowdhury, V.P., Maciejewski, A.A.: Task matching and scheduling in heterogeneous computing environments using a genetic-algorithm-based approach. J. Parallel Distrib. Comput. 47(1), 8–22 (1997)CrossRef Wang, L., Siegel, H.J., Roychowdhury, V.P., Maciejewski, A.A.: Task matching and scheduling in heterogeneous computing environments using a genetic-algorithm-based approach. J. Parallel Distrib. Comput. 47(1), 8–22 (1997)CrossRef
21.
go back to reference Corrêa, R.C., Ferreira, A., Rebreyend, P.: Scheduling multiprocessor tasks with genetic algorithms. IEEE Trans. Parallel Distrib. Syst. 10(8), 825–837 (1999)CrossRef Corrêa, R.C., Ferreira, A., Rebreyend, P.: Scheduling multiprocessor tasks with genetic algorithms. IEEE Trans. Parallel Distrib. Syst. 10(8), 825–837 (1999)CrossRef
22.
go back to reference Omara, F.A., Arafa, M.M.: Genetic algorithms for task scheduling problem. J. Parallel Distrib. Comput. 70(1), 13–22 (2010)CrossRefMATH Omara, F.A., Arafa, M.M.: Genetic algorithms for task scheduling problem. J. Parallel Distrib. Comput. 70(1), 13–22 (2010)CrossRefMATH
23.
go back to reference Mezmaz, M., Melab, N., Kessaci, Y., Lee, Y.C., Talbi, E.G., Zomaya, A.Y., Tuyttens, D.: A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems. J. Parallel Distrib. Comput. 71(11), 1497–1508 (2011)CrossRef Mezmaz, M., Melab, N., Kessaci, Y., Lee, Y.C., Talbi, E.G., Zomaya, A.Y., Tuyttens, D.: A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems. J. Parallel Distrib. Comput. 71(11), 1497–1508 (2011)CrossRef
24.
go back to reference Sheikh, H.F., Ahmad, I., Fan, D.: An evolutionary technique for performance-energy-temperature optimized scheduling of parallel tasks on multi-core processors. IEEE Trans. Parallel Distrib. Syst. 27(3), 668–681 (2016)CrossRef Sheikh, H.F., Ahmad, I., Fan, D.: An evolutionary technique for performance-energy-temperature optimized scheduling of parallel tasks on multi-core processors. IEEE Trans. Parallel Distrib. Syst. 27(3), 668–681 (2016)CrossRef
25.
go back to reference Langdon, W.B., Harman, M.: Genetically improved CUDA C++ software. In: Nicolau, M., Krawiec, K., Heywood, M.I., Castelli, M., García-Sánchez, P., Merelo, J.J., Rivas Santos, V.M., Sim, K. (eds.) EuroGP 2014. LNCS, vol. 8599, pp. 87–99. Springer, Heidelberg (2014). doi:10.1007/978-3-662-44303-3_8 Langdon, W.B., Harman, M.: Genetically improved CUDA C++ software. In: Nicolau, M., Krawiec, K., Heywood, M.I., Castelli, M., García-Sánchez, P., Merelo, J.J., Rivas Santos, V.M., Sim, K. (eds.) EuroGP 2014. LNCS, vol. 8599, pp. 87–99. Springer, Heidelberg (2014). doi:10.​1007/​978-3-662-44303-3_​8
26.
go back to reference Le Goues, C., Nguyen, T., Forrest, S., Weimer, W.: Genprog: a generic method for automatic software repair. IEEE Trans. Software Eng. 38(1), 54–72 (2012)CrossRef Le Goues, C., Nguyen, T., Forrest, S., Weimer, W.: Genprog: a generic method for automatic software repair. IEEE Trans. Software Eng. 38(1), 54–72 (2012)CrossRef
27.
go back to reference Ryan, C., Ivan, L., Koza, J.R., Banzhaf, W.: Automatic parallelization of loops in sequential programs using genetic programming. In: Genetic Programming 1998: Proceedings of the Third, pp. 344–349. Morgan Kaufmann (1998) Ryan, C., Ivan, L., Koza, J.R., Banzhaf, W.: Automatic parallelization of loops in sequential programs using genetic programming. In: Genetic Programming 1998: Proceedings of the Third, pp. 344–349. Morgan Kaufmann (1998)
28.
go back to reference Ryan, C., Ivan, L.: Automatic parallelization of arbitrary programs. In: Poli, R., Nordin, P., Langdon, W.B., Fogarty, T.C. (eds.) EuroGP 1999. LNCS, vol. 1598, pp. 244–254. Springer, Heidelberg (1999). doi:10.1007/3-540-48885-5_21CrossRef Ryan, C., Ivan, L.: Automatic parallelization of arbitrary programs. In: Poli, R., Nordin, P., Langdon, W.B., Fogarty, T.C. (eds.) EuroGP 1999. LNCS, vol. 1598, pp. 244–254. Springer, Heidelberg (1999). doi:10.​1007/​3-540-48885-5_​21CrossRef
Metadata
Title
Evolving Cut-Off Mechanisms and Other Work-Stealing Parameters for Parallel Programs
Authors
Alcides Fonseca
Nuno Lourenço
Bruno Cabral
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-55849-3_49

Premium Partner