Top

International Journal of Parallel Programming

Published in:

01-12-2016

Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime

Authors: Alcides Fonseca, Bruno Cabral, João Rafael, Ivo Correia

Published in: International Journal of Parallel Programming | Issue 6/2016

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

Abstract

There are billions of lines of sequential code inside nowadays’ software which do not benefit from the parallelism available in modern multicore architectures. Automatically parallelizing sequential code, to promote an efficient use of the available parallelism, has been a research goal for some time now. This work proposes a new approach for achieving such goal. We created a new parallelizing compiler that analyses the read and write instructions, and control-flow modifications in programs to identify a set of dependencies between the instructions in the program. Afterwards, the compiler, based on the generated dependencies graph, rewrites and organizes the program in a task-oriented structure. Parallel tasks are composed by instructions that cannot be executed in parallel. A work-stealing-based parallel runtime is responsible for scheduling and managing the granularity of the generated tasks. Furthermore, a compile-time granularity control mechanism also avoids creating unnecessary data-structures. This work focuses on the Java language, but the techniques are general enough to be applied to other programming languages. We have evaluated our approach on 8 benchmark programs against OoOJava, achieving higher speedups. In some cases, values were close to those of a manual parallelization. The resulting parallel code also has the advantage of being readable and easily configured to improve further its performance manually.

previous article A Parallelization Approach for Hard Real-Time Systems and Its Application on Two Industrial Programs

next article Purge-Rehab: Eager Software Transactional Memory with High Performance Under Contention

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

inform now

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

inform now

Amini, M., Creusillet, B., Even, S., Keryell, R., Goubier, O., Guelton, S., McMahon, J.O., Pasquier, F.X., Péan, G., Villalon, P.: Par4all: from convex array regions to heterogeneous computing. In: IMPACT 2012: Second International Workshop on Polyhedral Compilation Techniques HiPEAC 2012 (2012)

Ayguadé, E., Copty, N., Duran, A., Hoeflinger, J., Lin, Y., Massaioli, F., Teruel, X., Unnikrishnan, P., Zhang, G.: The design of openmp tasks. IEEE Trans. Parallel Distrib. Syst. 20(3), 404–418 (2009)CrossRef

Ayguadé, E., Duran, A., Hoeflinger, J., Massaioli, F., Teruel, X.: An experimental evaluation of the new openmp tasking model. In: Adve, V., Garzarán, M.J., Petersen, P. (eds.) Languages and Compilers for Parallel Computing, pp. 63–77. Springer, Berlin (2008)

Banerjee, U., Eigenmann, R., Nicolau, A., Padua, D.A., et al.: Automatic program parallelization. Proc. IEEE 81(2), 211–243 (1993)CrossRef

Bik, A.J., Gannon, D.B.: Automatically exploiting implicit parallelism in Java. Concurr. Pract. Exp. 9(6), 579–619 (1997)CrossRef

Bondhugula, U., Baskaran, M., Krishnamoorthy, S., Ramanujam, J., Rountev, A., Sadayappan, P.: Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In: Hendren, L. (ed.) Compiler Construction, pp. 132–146. Springer, Berlin (2008)

Chamberlain, B., Callahan, D., Zima, H.: Parallel programmability and the chapel language. Int. J. High Perform. Comput. Appl. 21(3), 291–312 (2007)CrossRef

Chan, B., Abdelrahman, T.S.: Run-time support for the automatic parallelization of Java programs. J. Supercomput. 28(1), 91–117 (2004)CrossRefMATH

Charles, P., Grothoff, C., Saraswat, V., Donawa, C., Kielstra, A., Ebcioglu, K., Von Praun, C., Sarkar, V.: X10: an object-oriented approach to non-uniform cluster computing. In ACM SIGPLAN Notices, vol. 40, pp. 519–538. ACM (2005)

10.

Chen, M.K., Olukotun, K.: The Jrpm system for dynamically parallelizing java programs. In: Proceedings of the 30th Annual International Symposium on Computer Architecture, 2003, pp. 434–445. IEEE (2003)

11.

Dagum, L., Enon, R.: Openmp: an industry standard api for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef

12.

Dave, C., Bae, H., Min, S.J., Lee, S., Eigenmann, R., Midkiff, S.: Cetus: a source-to-source compiler infrastructure for multicores. Computer 12, 36–42 (2009)CrossRef

13.

Dominguez, R.M.: Evaluating Different Java Bindings for OpenCL. Master thesis at Universidad Carlos III de Madrid. https://e-archivo.uc3m.es/handle/10016/17183 (2013)

14.

Duran, A., Corbalán, J., Ayguadé, E.: An adaptive cut-off for task parallelism. In: International Conference for High Performance Computing, Networking, Storage and Analysis, 2008. SC 2008, pp. 1–11. IEEE (2008)

15.

Duran, A., Corbalán, J., Ayguadé, E.: Evaluation of openmp task scheduling strategies. In: Eigenmann, R., de Supinski, B.R. (eds.) OpenMP in a New Era of Parallelism, pp. 100–110. Springer, Berlin (2008)

16.

Feautrier, P.: Automatic parallelization in the polytope model. In: Perrin, G.-R., Darte, A. (eds.) The Data Parallel Programming Model, pp. 79–103. Springer, Berlin (1996)

17.

Fonseca, A.: Æminium Benchmark Suite. https://github.com/AEminium/AeminiumBenchmarks (2013). Accessed 23 Oct 2013

18.

Fonseca, A., Cabral, B.: Æminiumgpu: an intelligent framework for gpu programming. In: Keller, R., Kramer, D., Weiss, J.-P. (eds.) Facing the Multicore-Challenge III, pp. 96–107. Springer, Berlin (2013)

19.

Frigo, M., Leiserson, C.E., Randall, K.H.: The implementation of the cilk-5 multithreaded language. In: Michael Berman, A. (ed.) ACM SIGPLAN Notices, vol. 33, pp. 212–223. ACM, New York, NY (1998)

20.

Hogen, G., Kindler, A., Loogen, R.: Automatic parallelization of lazy functional programs. In: ESOP’92, pp. 254–268. Springer, Berlin (1992)

21.

Jenista, J.C., Demsky, B.C., et al.: Ooojava: software out-of-order execution. In: Gill, A. (ed.) ACM SIGPLAN Notices, vol. 46, pp. 57–68. ACM, New York, NY (2011)

22.

Lea, D.: A Java fork/join framework. In: Proceedings of the ACM 2000 Conference on Java Grande, pp. 36–43. ACM (2000)

23.

Lee, S., Min, S.J., Eigenmann, R.: OpenMP to GPGPU: a compiler framework for automatic translation and optimization. ACM SIGPLAN Not. 44(4), 101–110 (2009)CrossRef

24.

Leino, K., Poetzsch-Heffter, A., Zhou, Y.: Using data groups to specify and check side effects. ACM SIGPLAN Not. 37(5), 246–257 (2002)CrossRef

25.

Marlow, S., Peyton Jones, S., Singh, S.: Runtime support for multicore haskell. In: Tolmach, A. (ed.) ACM SIGPLAN Notices, vol. 44, pp. 65–78. ACM, New York, NY (2009)

26.

Pawlak, R., Monperrus, M., Petitprez, N., Noguera, C., Seinturier, L.: Spoon v2: Large scale source code analysis and transformation for Java. Tech. Rep. hal-01078532, INRIA (2006). https://hal.inria.fr/hal-01078532

27.

Rafael, J., Correia, I., Fonseca, A., Cabral, B.: Dependency-based automatic parallelization of java applications. In: Euro-Par 2014: Parallel Processing Workshops, pp. 182–193. Springer, Berlin (2014)

28.

Senghor, A., Konate, K.: Fjcomp, a Java parallelizing compiler for dealing with divide-and-conquer algorithm. In: 2013 International Conference on Computer Applications Technology (ICCAT), pp. 1–5. IEEE (2013)

29.

Steele, G.: Parallel programming and parallel abstractions in fortress. Lect. Not. Comput. Sci. 3945, 1 (2006)CrossRef

30.

Stork, S., Naden, K., Sunshine, J., Mohr, M., Fonseca, A., Marques, P., Aldrich, J.: Æminium: a permission-based concurrent-by-default programming language approach. ACM Trans. Program. Lang. Syst. 36(1), 2 (2014)CrossRef

31.

Swaine, J., Tew, K., Dinda, P., Findler, R.B., Flatt, M.: Back to the futures: incremental parallelization of existing sequential runtime systems. In: Rinard, M., Sullivan, K.J., Steinberg, D.H. (eds.) ACM SIGPLAN Notices, vol. 45, pp. 583–597. ACM, New York, NY (2010)

32.

Tzannes, A., Caragea, G.C., Barua, R., Vishkin, U.: Lazy binary-splitting: a run-time adaptive work-stealing scheduler. In: Hall, M. (ed.) ACM SIGPLAN Notices, vol. 45, pp. 179–190. ACM, New York, NY (2010)

33.

Wang, C., Li, X., Zhang, J., Zhou, X., Nie, X.: Mp-tomasulo: a dependency-aware automatic parallel execution engine for sequential programs. ACM Trans. Archit. Code Optim. 10(2), 9 (2013)CrossRef

34.

Zhao, J., Rogers, I., Kirkham, C., Watson, I.: Loop parallelisation for the jikes rvm. In: Sixth International Conference on Parallel and Distributed Computing, Applications and Technologies, 2005. PDCAT 2005, pp. 35–39. IEEE (2005)

Title: Automatic Parallelization: Executing Sequential Programs on a Task-Based Parallel Runtime
Authors: Alcides Fonseca
Bruno Cabral
João Rafael
Ivo Correia
Publication date: 01-12-2016
Publisher: Springer US
Published in: International Journal of Parallel Programming / Issue 6/2016
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI: https://doi.org/10.1007/s10766-016-0426-5

Springer Professional

Abstract

Please log in to get access to your license.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Wirtschaft"

Springer Professional "Technik"

Other articles of this Issue 6/2016

Parallel Implementations of the Cooperative Particle Swarm Optimization on Many-core and Multi-core Architectures

Purge-Rehab: Eager Software Transactional Memory with High Performance Under Contention

Preface to the Special Issue on Sequential Code Parallelization

A Cross-ISA Kernelized High-Performance Parallel Emulator

A Parallel Yet Pipelined Architecture for Efficient Implementation of the Advanced Encryption Standard Algorithm on Reconfigurable Hardware

Transparent Speculative Parallelization of Discrete Event Simulation Applications Using Global Variables

Premium Partner