Skip to main content
main-content
Top

Hint

Swipe to navigate through the articles of this issue

02-03-2020 | Original Paper | Issue 1/2021

Numerical Algorithms 1/2021

Parallel reduction of four matrices to condensed form for a generalized matrix eigenvalue algorithm

Journal:
Numerical Algorithms > Issue 1/2021
Author:
Nela Bosner
Important notes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Abstract

The VZ algorithm proposed by Charles F. Van Loan (SIMA, 1975) attempts to solve the generalized type of matrix eigenvalue problem ACx = λBDx, where A, BRn×m, C, DRm×n, and mn, without forming products and inverses. Especially, this algorithm is suitable for solving the generalized singular value problem. Van Loan’s approach first reduces the matrices A, B, C, and D to a condensed form by the finite step initial reduction. The reduction finds orthogonal matrices Q, U, V, and Z, such that QAZ is upper Hessenberg, and QBV, ZTCU, and VTDU are upper triangular. In this initial reduction, A is reduced to upper Hessenberg form, while simultaneously preserving triangularity of other three matrices. This is done by Givens rotations, annihilating one by one element of A, and by generating three more rotations applied to other matrices per each annihilation. Such an algorithm is quite inefficient. In our work, we propose a blocked algorithm for the initial reduction, based on aggregated Givens rotations and matrix–matrix multiplications, which are applied in the outer loop updates. This algorithm has another level of blocking, exploited in the inner loop. Further, we also consider a variant of the algorithm in a hybrid CPU–GPU framework, where compute-intensive outer loop updates are performed on GPU, and can be overlapped with the reduction in the next step performed on CPU. On the other hand, application of a sequence of rotations in the inner loop is parallelized on CPU, with balanced operation count per thread. Since a large number of aggregated rotations are produced in every outer loop step, they are simultaneously accumulated before outer loop updates. These adjustments speed up original initial reduction considerably which is confirmed by numerical experiments, and the efficiency of the whole VZ algorithm is increased.

Please log in to get access to this content

To get access to this content you need the following product:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 58.000 Bücher
  • über 300 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Testen Sie jetzt 30 Tage kostenlos.

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 50.000 Bücher
  • über 380 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Umwelt
  • Maschinenbau + Werkstoffe




Testen Sie jetzt 30 Tage kostenlos.

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 69.000 Bücher
  • über 500 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Umwelt
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Testen Sie jetzt 30 Tage kostenlos.

Literature
About this article

Other articles of this Issue 1/2021

Numerical Algorithms 1/2021 Go to the issue

Premium Partner

    Image Credits