2016 | OriginalPaper | Book chapter

Efficient Implementation of Total FETI Solver for Graphic Processing Units Using Schur Complement

Authors: Lubomír Říha, Tomáš Brzobohatý, Alexandros Markopoulos, Tomáš Kozubek, Ondřej Meca, Olaf Schenk, Wim Vanroose

Published in: High Performance Computing in Science and Engineering

Publisher: Springer International Publishing

Abstract

This paper presents a new approach for accelerating FETI solvers on Graphics Processing Units (GPUs) using the Schur complement (SC) technique. With SCs, FETI solvers avoid working with the sparse Cholesky decomposition of the stiffness matrices. Instead, a dense structure in the form of an SC is computed and used by the conjugate gradient (CG) solver. In every CG iteration, the forward and backward substitutions, which are sequential, are replaced by a highly parallel General Matrix Vector Multiplication (GEMV) routine. This yields a 4.1-fold speedup when a Tesla K20X GPU accelerator is compared with a single 16-core AMD Opteron 6274 (Interlagos) CPU.
The main bottleneck of this method is the computation of the Schur complements of the stiffness matrices. This bottleneck is significantly reduced by using the new PARDISO-SC sparse direct solver. The paper also presents a performance evaluation of SC computations for three-dimensional elasticity stiffness matrices.
We evaluate the performance of the proposed approach using our implementation in the ESPRESO solver package.
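
To make the central idea concrete, the following is a minimal sketch in plain Python, not the ESPRESO or PARDISO-SC implementation. It assumes a small, symmetric positive definite stiffness matrix K (in Total FETI the subdomain matrices are singular and need a generalized inverse, which is omitted here) and a hypothetical gluing matrix B. It forms the dense operator S = B K^{-1} B^T once, which is assumed here to be the Schur complement the abstract refers to (up to sign), and then checks that a single dense matrix-vector product reproduces what the forward and backward substitutions would compute in each CG iteration. All names and data are illustrative assumptions.

```python
import math

def cholesky(K):
    """Return lower-triangular L with K = L L^T (K assumed SPD)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve L L^T x = b by forward, then backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def matvec(A, v):
    """Dense matrix-vector product (the GEMV-style operation)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def schur_complement(L, B):
    """Dense S = B K^{-1} B^T, built with one triangular solve pair per row of B."""
    cols = [matvec(B, solve_cholesky(L, row)) for row in B]
    return transpose(cols)

# Illustrative data (assumption): 1D Laplacian-like stiffness matrix and a
# gluing matrix B that selects the two end degrees of freedom.
K = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
B = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
lam = [1.0, -2.0]  # a dual (Lagrange multiplier) vector, as iterated by CG

L = cholesky(K)
S = schur_complement(L, B)  # expensive, done once before the iterations

# Per-iteration work, two equivalent ways:
via_solves = matvec(B, solve_cholesky(L, matvec(transpose(B), lam)))
via_gemv = matvec(S, lam)
```

The trade-off mirrors the abstract: building S costs one pair of triangular solves per row of B up front, but afterwards each iteration is a single dense product, which maps well onto a GPU, whereas the triangular solves are inherently sequential.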

Metadata
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-40361-8_6