Skip to main content
Top
Published in: International Journal of Parallel Programming 6/2016

01-12-2016

Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime

Authors: Alcides Fonseca, Bruno Cabral, João Rafael, Ivo Correia

Published in: International Journal of Parallel Programming | Issue 6/2016

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

There are billions of lines of sequential code inside nowadays’ software which do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the available parallelism, has been a research goal for some time now. This work proposes a new approach for achieving such goal. We created a new parallelizing compiler that analyses the read and write instructions, and control-flow modifications in programs to identify a set of dependencies between the instructions in the program. Afterwards, the compiler, based on the generated dependencies graph, rewrites and organizes the program in a task-oriented structure. Parallel tasks are composed by instructions that cannot be executed in parallel. A work-stealing-based parallel runtime is responsible for scheduling and managing the granularity of the generated tasks. Furthermore, a compile-time granularity control mechanism also avoids creating unnecessary data-structures. This work focuses on the Java language, but the techniques are general enough to be applied to other programming languages. We have evaluated our approach on 8 benchmark programs against OoOJava, achieving higher speedups. In some cases, values were close to those of a manual parallelization. The resulting parallel code also has the advantage of being readable and easily configured to improve further its performance manually.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Amini, M., Creusillet, B., Even, S., Keryell, R., Goubier, O., Guelton, S., McMahon, J.O., Pasquier, F.X., Péan, G., Villalon, P.: Par4all: from convex array regions to heterogeneous computing. In: IMPACT 2012: Second International Workshop on Polyhedral Compilation Techniques HiPEAC 2012 (2012) Amini, M., Creusillet, B., Even, S., Keryell, R., Goubier, O., Guelton, S., McMahon, J.O., Pasquier, F.X., Péan, G., Villalon, P.: Par4all: from convex array regions to heterogeneous computing. In: IMPACT 2012: Second International Workshop on Polyhedral Compilation Techniques HiPEAC 2012 (2012)
2.
go back to reference Ayguadé, E., Copty, N., Duran, A., Hoeflinger, J., Lin, Y., Massaioli, F., Teruel, X., Unnikrishnan, P., Zhang, G.: The design of openmp tasks. IEEE Trans. Parallel Distrib. Syst. 20(3), 404–418 (2009)CrossRef Ayguadé, E., Copty, N., Duran, A., Hoeflinger, J., Lin, Y., Massaioli, F., Teruel, X., Unnikrishnan, P., Zhang, G.: The design of openmp tasks. IEEE Trans. Parallel Distrib. Syst. 20(3), 404–418 (2009)CrossRef
3.
go back to reference Ayguadé, E., Duran, A., Hoeflinger, J., Massaioli, F., Teruel, X.: An experimental evaluation of the new openmp tasking model. In: Adve, V., Garzarán, M.J., Petersen, P. (eds.) Languages and Compilers for Parallel Computing, pp. 63–77. Springer, Berlin (2008) Ayguadé, E., Duran, A., Hoeflinger, J., Massaioli, F., Teruel, X.: An experimental evaluation of the new openmp tasking model. In: Adve, V., Garzarán, M.J., Petersen, P. (eds.) Languages and Compilers for Parallel Computing, pp. 63–77. Springer, Berlin (2008)
4.
go back to reference Banerjee, U., Eigenmann, R., Nicolau, A., Padua, D.A., et al.: Automatic program parallelization. Proc. IEEE 81(2), 211–243 (1993)CrossRef Banerjee, U., Eigenmann, R., Nicolau, A., Padua, D.A., et al.: Automatic program parallelization. Proc. IEEE 81(2), 211–243 (1993)CrossRef
5.
go back to reference Bik, A.J., Gannon, D.B.: Automatically exploiting implicit parallelism in Java. Concurr. Pract. Exp. 9(6), 579–619 (1997)CrossRef Bik, A.J., Gannon, D.B.: Automatically exploiting implicit parallelism in Java. Concurr. Pract. Exp. 9(6), 579–619 (1997)CrossRef
6.
go back to reference Bondhugula, U., Baskaran, M., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In: Hendren, L. (ed.) Compiler Construction, pp. 132–146. Springer, Berlin (2008) Bondhugula, U., Baskaran, M., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In: Hendren, L. (ed.) Compiler Construction, pp. 132–146. Springer, Berlin (2008)
7.
go back to reference Chamberlain, B., Callahan, D., Zima, H.: Parallel programmability and the chapel language. Int. J. High Perform. Comput. Appl. 21(3), 291–312 (2007)CrossRef Chamberlain, B., Callahan, D., Zima, H.: Parallel programmability and the chapel language. Int. J. High Perform. Comput. Appl. 21(3), 291–312 (2007)CrossRef
8.
go back to reference Chan, B., Abdelrahman, T.S.: Run-time support for the automatic parallelization of Java programs. J. Supercomput. 28(1), 91–117 (2004)CrossRefMATH Chan, B., Abdelrahman, T.S.: Run-time support for the automatic parallelization of Java programs. J. Supercomput. 28(1), 91–117 (2004)CrossRefMATH
9.
go back to reference Charles, P., Grothoff, C., Saraswat, V., Donawa, C., Kielstra, A., Ebcioglu, K., Von Praun, C., Sarkar, V.: X10: an object-oriented approach to non-uniform cluster computing. In ACM SIGPLAN Notices, vol. 40, pp. 519–538. ACM (2005) Charles, P., Grothoff, C., Saraswat, V., Donawa, C., Kielstra, A., Ebcioglu, K., Von Praun, C., Sarkar, V.: X10: an object-oriented approach to non-uniform cluster computing. In ACM SIGPLAN Notices, vol. 40, pp. 519–538. ACM (2005)
10.
go back to reference Chen, M.K., Olukotun, K.: The Jrpm system for dynamically parallelizing java programs. In: Proceedings of the 30th Annual International Symposium on Computer Architecture, 2003, pp. 434–445. IEEE (2003) Chen, M.K., Olukotun, K.: The Jrpm system for dynamically parallelizing java programs. In: Proceedings of the 30th Annual International Symposium on Computer Architecture, 2003, pp. 434–445. IEEE (2003)
11.
go back to reference Dagum, L., Enon, R.: Openmp: an industry standard api for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef Dagum, L., Enon, R.: Openmp: an industry standard api for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef
12.
go back to reference Dave, C., Bae, H., Min, S.J., Lee, S., Eigenmann, R., Midkiff, S.: Cetus: a source-to-source compiler infrastructure for multicores. Computer 12, 36–42 (2009)CrossRef Dave, C., Bae, H., Min, S.J., Lee, S., Eigenmann, R., Midkiff, S.: Cetus: a source-to-source compiler infrastructure for multicores. Computer 12, 36–42 (2009)CrossRef
14.
go back to reference Duran, A., Corbalán, J., Ayguadé, E.: An adaptive cut-off for task parallelism. In: International Conference for High Performance Computing, Networking, Storage and Analysis, 2008. SC 2008, pp. 1–11. IEEE (2008) Duran, A., Corbalán, J., Ayguadé, E.: An adaptive cut-off for task parallelism. In: International Conference for High Performance Computing, Networking, Storage and Analysis, 2008. SC 2008, pp. 1–11. IEEE (2008)
15.
go back to reference Duran, A., Corbalán, J., Ayguadé, E.: Evaluation of openmp task scheduling strategies. In: Eigenmann, R., de Supinski, B.R. (eds.) OpenMP in a New Era of Parallelism, pp. 100–110. Springer, Berlin (2008) Duran, A., Corbalán, J., Ayguadé, E.: Evaluation of openmp task scheduling strategies. In: Eigenmann, R., de Supinski, B.R. (eds.) OpenMP in a New Era of Parallelism, pp. 100–110. Springer, Berlin (2008)
16.
go back to reference Feautrier, P.: Automatic parallelization in the polytope model. In: Perrin, G.-R., Darte, A. (eds.) The Data Parallel Programming Model, pp. 79–103. Springer, Berlin (1996) Feautrier, P.: Automatic parallelization in the polytope model. In: Perrin, G.-R., Darte, A. (eds.) The Data Parallel Programming Model, pp. 79–103. Springer, Berlin (1996)
18.
go back to reference Fonseca, A., Cabral, B.: Æminiumgpu: an intelligent framework for gpu programming. In: Keller, R., Kramer, D., Weiss, J.-P. (eds.) Facing the Multicore-Challenge III, pp. 96–107. Springer, Berlin (2013) Fonseca, A., Cabral, B.: Æminiumgpu: an intelligent framework for gpu programming. In: Keller, R., Kramer, D., Weiss, J.-P. (eds.) Facing the Multicore-Challenge III, pp. 96–107. Springer, Berlin (2013)
19.
go back to reference Frigo, M., Leiserson, C.E., Randall, K.H.: The implementation of the cilk-5 multithreaded language. In: Michael Berman, A. (ed.) ACM SIGPLAN Notices, vol. 33, pp. 212–223. ACM, New York, NY (1998) Frigo, M., Leiserson, C.E., Randall, K.H.: The implementation of the cilk-5 multithreaded language. In: Michael Berman, A. (ed.) ACM SIGPLAN Notices, vol. 33, pp. 212–223. ACM, New York, NY (1998)
20.
go back to reference Hogen, G., Kindler, A., Loogen, R.: Automatic parallelization of lazy functional programs. In: ESOP’92, pp. 254–268. Springer, Berlin (1992) Hogen, G., Kindler, A., Loogen, R.: Automatic parallelization of lazy functional programs. In: ESOP’92, pp. 254–268. Springer, Berlin (1992)
21.
go back to reference Jenista, J.C., Demsky, B.C., et al.: Ooojava: software out-of-order execution. In: Gill, A. (ed.) ACM SIGPLAN Notices, vol. 46, pp. 57–68. ACM, New York, NY (2011) Jenista, J.C., Demsky, B.C., et al.: Ooojava: software out-of-order execution. In: Gill, A. (ed.) ACM SIGPLAN Notices, vol. 46, pp. 57–68. ACM, New York, NY (2011)
22.
go back to reference Lea, D.: A Java fork/join framework. In: Proceedings of the ACM 2000 Conference on Java Grande, pp. 36–43. ACM (2000) Lea, D.: A Java fork/join framework. In: Proceedings of the ACM 2000 Conference on Java Grande, pp. 36–43. ACM (2000)
23.
go back to reference Lee, S., Min, S.J., Eigenmann, R.: OpenMP to GPGPU: a compiler framework for automatic translation and optimization. ACM SIGPLAN Not. 44(4), 101–110 (2009)CrossRef Lee, S., Min, S.J., Eigenmann, R.: OpenMP to GPGPU: a compiler framework for automatic translation and optimization. ACM SIGPLAN Not. 44(4), 101–110 (2009)CrossRef
24.
go back to reference Leino, K., Poetzsch-Heffter, A., Zhou, Y.: Using data groups to specify and check side effects. ACM SIGPLAN Not. 37(5), 246–257 (2002)CrossRef Leino, K., Poetzsch-Heffter, A., Zhou, Y.: Using data groups to specify and check side effects. ACM SIGPLAN Not. 37(5), 246–257 (2002)CrossRef
25.
go back to reference Marlow, S., Peyton Jones, S., Singh, S.: Runtime support for multicore haskell. In: Tolmach, A. (ed.) ACM SIGPLAN Notices, vol. 44, pp. 65–78. ACM, New York, NY (2009) Marlow, S., Peyton Jones, S., Singh, S.: Runtime support for multicore haskell. In: Tolmach, A. (ed.) ACM SIGPLAN Notices, vol. 44, pp. 65–78. ACM, New York, NY (2009)
27.
go back to reference Rafael, J., Correia, I., Fonseca, A., Cabral, B.: Dependency-based automatic parallelization of java applications. In: Euro-Par 2014: Parallel Processing Workshops, pp. 182–193. Springer, Berlin (2014) Rafael, J., Correia, I., Fonseca, A., Cabral, B.: Dependency-based automatic parallelization of java applications. In: Euro-Par 2014: Parallel Processing Workshops, pp. 182–193. Springer, Berlin (2014)
28.
go back to reference Senghor, A., Konate, K.: Fjcomp, a Java parallelizing compiler for dealing with divide-and-conquer algorithm. In: 2013 International Conference on Computer Applications Technology (ICCAT), pp. 1–5. IEEE (2013) Senghor, A., Konate, K.: Fjcomp, a Java parallelizing compiler for dealing with divide-and-conquer algorithm. In: 2013 International Conference on Computer Applications Technology (ICCAT), pp. 1–5. IEEE (2013)
29.
go back to reference Steele, G.: Parallel programming and parallel abstractions in fortress. Lect. Not. Comput. Sci. 3945, 1 (2006)CrossRef Steele, G.: Parallel programming and parallel abstractions in fortress. Lect. Not. Comput. Sci. 3945, 1 (2006)CrossRef
30.
go back to reference Stork, S., Naden, K., Sunshine, J., Mohr, M., Fonseca, A., Marques, P., Aldrich, J.: Æminium: a permission-based concurrent-by-default programming language approach. ACM Trans. Program. Lang. Syst. 36(1), 2 (2014)CrossRef Stork, S., Naden, K., Sunshine, J., Mohr, M., Fonseca, A., Marques, P., Aldrich, J.: Æminium: a permission-based concurrent-by-default programming language approach. ACM Trans. Program. Lang. Syst. 36(1), 2 (2014)CrossRef
31.
go back to reference Swaine, J., Tew, K., Dinda, P., Findler, R.B., Flatt, M.: Back to the futures: incremental parallelization of existing sequential runtime systems. In: Rinard, M., Sullivan, K.J., Steinberg, D.H. (eds.) ACM SIGPLAN Notices, vol. 45, pp. 583–597. ACM, New York, NY (2010) Swaine, J., Tew, K., Dinda, P., Findler, R.B., Flatt, M.: Back to the futures: incremental parallelization of existing sequential runtime systems. In: Rinard, M., Sullivan, K.J., Steinberg, D.H. (eds.) ACM SIGPLAN Notices, vol. 45, pp. 583–597. ACM, New York, NY (2010)
32.
go back to reference Tzannes, A., Caragea, G.C., Barua, R., Vishkin, U.: Lazy binary-splitting: a run-time adaptive work-stealing scheduler. In: Hall, M. (ed.) ACM SIGPLAN Notices, vol. 45, pp. 179–190. ACM, New York, NY (2010) Tzannes, A., Caragea, G.C., Barua, R., Vishkin, U.: Lazy binary-splitting: a run-time adaptive work-stealing scheduler. In: Hall, M. (ed.) ACM SIGPLAN Notices, vol. 45, pp. 179–190. ACM, New York, NY (2010)
33.
go back to reference Wang, C., Li, X., Zhang, J., Zhou, X., Nie, X.: Mp-tomasulo: a dependency-aware automatic parallel execution engine for sequential programs. ACM Trans. Archit. Code Optim. 10(2), 9 (2013)CrossRef Wang, C., Li, X., Zhang, J., Zhou, X., Nie, X.: Mp-tomasulo: a dependency-aware automatic parallel execution engine for sequential programs. ACM Trans. Archit. Code Optim. 10(2), 9 (2013)CrossRef
34.
go back to reference Zhao, J., Rogers, I., Kirkham, C., Watson, I.: Loop parallelisation for the jikes rvm. In: Sixth International Conference on Parallel and Distributed Computing, Applications and Technologies, 2005. PDCAT 2005, pp. 35–39. IEEE (2005) Zhao, J., Rogers, I., Kirkham, C., Watson, I.: Loop parallelisation for the jikes rvm. In: Sixth International Conference on Parallel and Distributed Computing, Applications and Technologies, 2005. PDCAT 2005, pp. 35–39. IEEE (2005)
Metadata
Title
Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime
Authors
Alcides Fonseca
Bruno Cabral
João Rafael
Ivo Correia
Publication date
01-12-2016
Publisher
Springer US
Published in
International Journal of Parallel Programming / Issue 6/2016
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-016-0426-5

Other articles of this Issue 6/2016

International Journal of Parallel Programming 6/2016 Go to the issue

Premium Partner