Published in: The Journal of Supercomputing 1/2014

01-07-2014

Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters

Authors: Lilia Ziane Khodja, Raphaël Couturier, Arnaud Giersch, Jacques M. Bahi

Abstract

In this paper, we aim to exploit the computing power of a graphics processing unit (GPU) cluster for solving large sparse linear systems. We implement a parallel algorithm for the generalized minimal residual (GMRES) iterative method using the Compute Unified Device Architecture (CUDA) programming language and the MPI parallel environment. The experiments show that a GPU cluster is more efficient than a CPU cluster. To optimize performance, we use a compressed storage format for the sparse vectors together with hypergraph partitioning. These solutions improve the spatial and temporal locality of the data shared between the computing nodes of the GPU cluster.
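
The compressed exchange of shared vector data mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CUDA/MPI implementation. The function names, the two-node layout, and the sample values are hypothetical and chosen only for illustration. The idea shown is that a node sends only the vector entries its neighbor's matrix rows actually reference, rather than its whole local sub-vector.

```python
def columns_needed_from_others(col_indices, first_owned, last_owned):
    """Return the sorted global column indices referenced by the local
    matrix rows that fall outside the locally owned index range."""
    return sorted({j for j in col_indices if j < first_owned or j > last_owned})


def pack_entries(x_local, offset, wanted):
    """Pack only the requested entries of the local sub-vector.
    `offset` is the global index of x_local[0] (hypothetical layout)."""
    return [x_local[j - offset] for j in wanted]


def unpack_entries(wanted, values):
    """Rebuild a lookup from global index to received value."""
    return dict(zip(wanted, values))


# Hypothetical two-node example: node 0 owns indices 0..3, node 1 owns 4..7.
x_node1 = [14.0, 15.0, 16.0, 17.0]           # entries 4..7 held by node 1
cols_node0 = [0, 1, 5, 2, 3, 7, 5]           # column indices of node 0's nonzeros

wanted = columns_needed_from_others(cols_node0, 0, 3)
message = pack_entries(x_node1, 4, wanted)   # what node 1 would send to node 0
received = unpack_entries(wanted, message)
```

In this toy case node 1 sends two values instead of its four-entry sub-vector. A partitioning that keeps the `wanted` lists short reduces the communication volume further, which is the role the abstract assigns to hypergraph partitioning.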

Metadata
Title
Parallel sparse linear solver with GMRES method using minimization techniques of communications for GPU clusters
Authors
Lilia Ziane Khodja
Raphaël Couturier
Arnaud Giersch
Jacques M. Bahi
Publication date
01-07-2014
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 1/2014
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-014-1143-8
