Autonomic Coordination of Skeleton-Based Applications Over CPU/GPU Multi-Core Architectures

Authors: Mehdi Goli, Horacio González-Vélez

Published in: International Journal of Parallel Programming, Issue 2/2017

Published: 30 March 2016

Abstract

Widely adumbrated as patterns of parallel computation and communication, algorithmic skeletons introduce a viable solution for efficiently programming modern heterogeneous multi-core architectures equipped not only with traditional multi-core CPUs, but also with one or more programmable Graphics Processing Units (GPUs). By systematically applying algorithmic skeletons to address complex programming tasks, it is arguably possible to separate the coordination from the computation in a parallel program, and therefore subdivide a complex program into building blocks (modules, skids, or components) that can be independently created and then used in different systems to drive multiple functionalities. By exploiting such systematic division, it is feasible to automate coordination by addressing extra-functional and non-functional features such as application performance, portability, and resource utilisation from the component level in heterogeneous multi-core architectures. In this paper, we introduce a novel approach to exploit the inherent features of skeleton-based applications in order to automatically coordinate them over heterogeneous (CPU/GPU) multi-core architectures and improve their performance. Our systematic evaluation demonstrates up to one order of magnitude speed-up on heterogeneous multi-core architectures.
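The separation of coordination from computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework and does not use the FastFlow API. The helper names `farm` and `pipeline` are hypothetical, the workers are plain CPU threads, and there is no GPU offloading or autonomic tuning. It only shows that the skeletons decide how work is distributed and ordered, while the business logic lives in ordinary functions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def farm(worker: Callable[[T], R], tasks: Iterable[T], n_workers: int = 4) -> List[R]:
    """Coordination only (hypothetical helper): hand independent tasks to a
    pool of workers and collect the results in input order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, tasks))


def pipeline(*stages: Callable) -> Callable:
    """Coordination only (hypothetical helper): compose stages so that each
    item passes through them one after another."""
    def run(item):
        for stage in stages:
            item = stage(item)
        return item
    return run


# Computation only: ordinary functions that know nothing about parallelism.
def square(x: int) -> int:
    return x * x


def increment(x: int) -> int:
    return x + 1


if __name__ == "__main__":
    # A farm whose worker is a two-stage pipeline: skeletons nest freely.
    print(farm(pipeline(square, increment), range(5), n_workers=2))
    # prints [1, 2, 5, 10, 17]
```

Because `square` and `increment` never refer to threads or devices, the coordination layer could in principle change the worker count or the target device without touching them. That is the property the abstract relies on for automating coordination.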

PEI is an open-source framework available at https://github.com/mehdi-goli/MC-FastFlow-PEI.

Title: Autonomic Coordination of Skeleton-Based Applications Over CPU/GPU Multi-Core Architectures
Authors: Mehdi Goli, Horacio González-Vélez
Publication date: 30 March 2016
Publisher: Springer US
Published in: International Journal of Parallel Programming, Issue 2/2017
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI: https://doi.org/10.1007/s10766-016-0419-4
