The Journal of Supercomputing | 01-11-2012

Paradigmatic shifts for exascale supercomputing

Authors: Neal E. Davis, Robert W. Robey, Charles R. Ferenbaugh, David Nicholaeff, Dennis P. Trujillo

Published in: The Journal of Supercomputing | Issue 2/2012

Abstract

As the next generation of supercomputers reaches the exascale, the dominant design parameter governing performance will shift from hardware to software. Intelligent use of memory access, vectorization, and intranode threading will become critical to the performance of scientific applications and numerical calculations on exascale supercomputers. Although challenges remain in effectively programming the heterogeneous devices likely to be used in future supercomputers, new languages and tools are providing a pathway for application developers to tackle this new frontier. These languages include open programming standards such as OpenCL and OpenACC, as well as widely adopted languages such as CUDA; also of importance are high-quality libraries such as CUDPP and Thrust. This article surveys a purposely diverse set of proof-of-concept applications developed at Los Alamos National Laboratory. We find that the capability level of accelerator computing hardware and languages has moved beyond regular-grid finite difference calculations and molecular dynamics codes. More advanced applications requiring dynamic memory allocation, such as cell-based adaptive mesh refinement, can now be addressed, and with more effort even unstructured mesh codes can be moved to the GPU.

Title: Paradigmatic shifts for exascale supercomputing
Authors: Neal E. Davis, Robert W. Robey, Charles R. Ferenbaugh, David Nicholaeff, Dennis P. Trujillo
Publication date: 01-11-2012
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 2/2012
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-012-0789-3
