Abstract
Graphics Processing Units (GPUs) have far outgrown their initial role as graphics accelerators and have taken on a new role as co-processors for computation-heavy tasks. Both the hardware and software ecosystems have now matured: fully IEEE-compliant double precision and memory error correction are supported, and a rich set of software tools and libraries is available. This in turn has led to their increased adoption in a growing number of fields, both in academia and, more recently, in industry. In this review we investigate the adoption of GPUs as accelerators in the field of Finite Element Structural Analysis, a design tool that is now essential in many branches of engineering. We survey the work that has been done in accelerating the most time-consuming steps of the analysis, indicate the speedup that has been achieved and, where available, highlight software libraries and packages that will enable the reader to take advantage of such acceleration. Overall, we try to draw a high-level picture of the current state of the art.
Cite this article
Georgescu, S., Chow, P. & Okuda, H. GPU Acceleration for FEM-Based Structural Analysis. Arch Computat Methods Eng 20, 111–121 (2013). https://doi.org/10.1007/s11831-013-9082-8
DOI: https://doi.org/10.1007/s11831-013-9082-8