Skip to main content

2016 | OriginalPaper | Buchkapitel

Task-Based Cholesky Decomposition on Knights Corner Using OpenMP

verfasst von : Joseph Dorris, Jakub Kurzak, Piotr Luszczek, Asim YarKhan, Jack Dongarra

Erschienen in: High Performance Computing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The growing popularity of the Intel Xeon Phi coprocessors and the continued development of this new many-core architecture have created the need for an open-source, scalable, and cross-platform task-based dense linear algebra package that can efficiently use this type of hardware. In this paper, we examined the design modifications necessary when porting PLASMA, a task-based dense linear algebra library, to run effectively on Intel’s Knights Corner Xeon Phi coprocessor. First, we modified PLASMA’s tiled Cholesky decomposition to use OpenMP for its scheduling mechanism to enable Xeon Phi compatibility. We then compared the performance of our modified code to that of the original dynamic scheduler running on an Intel Xeon Sandy Bridge CPU. Finally, we looked at the performance of the new OpenMP tiled Cholesky decomposition on a Knights Corner coprocessor. We found that desirable performance for this architecture was attainable with the right code optimizations; these changes were necessary to account for differences in the runtimes and in the hardware itself.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Agullo, E., Demmel, J., Dongarra, J., Hadri, B., Kurzak, J., Langou, J., Ltaief, H., Luszczek, P., Tomov, S.: Numerical linear algebra on emerging architectures: the PLASMA and MAGMA projects. J. Phys. Conf. Ser. 180, 012037 (2009). IOP PublishingCrossRef Agullo, E., Demmel, J., Dongarra, J., Hadri, B., Kurzak, J., Langou, J., Ltaief, H., Luszczek, P., Tomov, S.: Numerical linear algebra on emerging architectures: the PLASMA and MAGMA projects. J. Phys. Conf. Ser. 180, 012037 (2009). IOP PublishingCrossRef
2.
Zurück zum Zitat Anderson, E., Bai, Z., Bischof, C., Blackford, S.L., Demmel, J.W., Dongarra, J.J., Croz, J.D., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D.C.: LAPACK User’s Guide, 3rd edn. Society for Industrial and Applied Mathematics, Philadelphia (1999)CrossRefMATH Anderson, E., Bai, Z., Bischof, C., Blackford, S.L., Demmel, J.W., Dongarra, J.J., Croz, J.D., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D.C.: LAPACK User’s Guide, 3rd edn. Society for Industrial and Applied Mathematics, Philadelphia (1999)CrossRefMATH
3.
Zurück zum Zitat Blackford, S., Dongarra, J.J.: Installation guide for LAPACK. Technical report 41, LAPACK Working Note, June 1999 (originally released March 1992) Blackford, S., Dongarra, J.J.: Installation guide for LAPACK. Technical report 41, LAPACK Working Note, June 1999 (originally released March 1992)
4.
Zurück zum Zitat Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Comput. 35(1), 38–53 (2009)MathSciNetCrossRef Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Comput. 35(1), 38–53 (2009)MathSciNetCrossRef
5.
Zurück zum Zitat Chrysos, G.: Intel\({\textregistered }\) Xeon Phi coprocessor-the architecture. Intel Whitepaper (2014) Chrysos, G.: Intel\({\textregistered }\) Xeon Phi coprocessor-the architecture. Intel Whitepaper (2014)
6.
Zurück zum Zitat Dagnum, L., Menon, R.: OpenMP: an industry-standard API for shared memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef Dagnum, L., Menon, R.: OpenMP: an industry-standard API for shared memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef
7.
Zurück zum Zitat Dolz, M.F., Igual, F.D., Ludwig, T., Piñuel, L., Quintana-Ortí, E.S.: Balancing task-and data-level parallelism to improve performance and energy consumption of matrix computations on the Intel Xeon Phi. Comput. Electr. Eng. 46, 95–111 (2015)CrossRef Dolz, M.F., Igual, F.D., Ludwig, T., Piñuel, L., Quintana-Ortí, E.S.: Balancing task-and data-level parallelism to improve performance and energy consumption of matrix computations on the Intel Xeon Phi. Comput. Electr. Eng. 46, 95–111 (2015)CrossRef
8.
Zurück zum Zitat Dongarra, J., Gates, M., Haidar, A., Jia, Y., Kabir, K., Luszczek, P., Tomov, S.: Portable HPC programming on Intel many-integrated-core hardware with MAGMA port to Xeon Phi. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds.) PPAM 2013, Part I. LNCS, vol. 8384, pp. 571–581. Springer, Heidelberg (2014)CrossRef Dongarra, J., Gates, M., Haidar, A., Jia, Y., Kabir, K., Luszczek, P., Tomov, S.: Portable HPC programming on Intel many-integrated-core hardware with MAGMA port to Xeon Phi. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds.) PPAM 2013, Part I. LNCS, vol. 8384, pp. 571–581. Springer, Heidelberg (2014)CrossRef
9.
Zurück zum Zitat Duran, A., Ayguadé, E., Badia, R.M., Labarta, J., Martinell, L., Martorell, X., Planas, J.: OmpSs: a proposal for programming heterogeneous multi-core architectures. Parallel Process. Lett. 21(02), 173–193 (2011)MathSciNetCrossRef Duran, A., Ayguadé, E., Badia, R.M., Labarta, J., Martinell, L., Martorell, X., Planas, J.: OmpSs: a proposal for programming heterogeneous multi-core architectures. Parallel Process. Lett. 21(02), 173–193 (2011)MathSciNetCrossRef
10.
Zurück zum Zitat Duran, A., Klemm, M.: The Intel many integrated core architecture. In: 2012 International Conference on High Performance Computing and Simulation (HPCS), pp. 365–366. IEEE (2012) Duran, A., Klemm, M.: The Intel many integrated core architecture. In: 2012 International Conference on High Performance Computing and Simulation (HPCS), pp. 365–366. IEEE (2012)
11.
Zurück zum Zitat Fang, J., Varbanescu, A.L., Sips, H., Zhang, L., Che, Y., Xu, C.: An empirical study of Intel Xeon Phi (2013). arXiv preprint: arXiv:1310.5842 Fang, J., Varbanescu, A.L., Sips, H., Zhang, L., Che, Y., Xu, C.: An empirical study of Intel Xeon Phi (2013). arXiv preprint: arXiv:​1310.​5842
12.
Zurück zum Zitat Kurzak, J., Ltaief, H., Dongarra, J., Badia, R.: Scheduling linear algebra operations on multicore processors. Concurr. Comput. Pract. Exp. 22, 15–44 (2010)CrossRef Kurzak, J., Ltaief, H., Dongarra, J., Badia, R.: Scheduling linear algebra operations on multicore processors. Concurr. Comput. Pract. Exp. 22, 15–44 (2010)CrossRef
13.
Zurück zum Zitat Lima, J.V., Broquedis, F., Gautier, T., Raffin, B.: Preliminary experiments with XKaapi on Intel Xeon Phi coprocessor. In: 2013 25th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp. 105–112. IEEE (2013) Lima, J.V., Broquedis, F., Gautier, T., Raffin, B.: Preliminary experiments with XKaapi on Intel Xeon Phi coprocessor. In: 2013 25th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp. 105–112. IEEE (2013)
14.
Zurück zum Zitat Quintana-Ortí, G., Quintana-Ortí, E.S., Geijn, R.A., Zee, F.G.V., Chan, E.: Programming matrix algorithms-by-blocks for thread-level parallelism. ACM Trans. Math. Softw. (TOMS) 36(3), 14 (2009)MathSciNetCrossRef Quintana-Ortí, G., Quintana-Ortí, E.S., Geijn, R.A., Zee, F.G.V., Chan, E.: Programming matrix algorithms-by-blocks for thread-level parallelism. ACM Trans. Math. Softw. (TOMS) 36(3), 14 (2009)MathSciNetCrossRef
16.
Zurück zum Zitat Schmidl, D., Cramer, T., Wienke, S., Terboven, C., Müller, M.S.: Assessing the performance of OpenMP programs on the Intel Xeon Phi. In: Mohr, B., Mey, D., Wolf, F. (eds.) Euro-Par 2013. LNCS, vol. 8097, pp. 547–558. Springer, Heidelberg (2013)CrossRef Schmidl, D., Cramer, T., Wienke, S., Terboven, C., Müller, M.S.: Assessing the performance of OpenMP programs on the Intel Xeon Phi. In: Mohr, B., Mey, D., Wolf, F. (eds.) Euro-Par 2013. LNCS, vol. 8097, pp. 547–558. Springer, Heidelberg (2013)CrossRef
17.
Zurück zum Zitat Trader, T.: Intel Debuts ‘Knights Landing’ Ninja Developer Platform. HPCwire, April 2016 Trader, T.: Intel Debuts ‘Knights Landing’ Ninja Developer Platform. HPCwire, April 2016
18.
Zurück zum Zitat Virouleau, P., Brunet, P., Broquedis, F., Furmento, N., Thibault, S., Aumage, O., Gautier, T.: Evaluation of OpenMP dependent tasks with the KASTORS benchmark suite. In: DeRose, L., Supinski, B.R., Olivier, S.L., Chapman, B.M., Müller, M.S. (eds.) IWOMP 2014. LNCS, vol. 8766, pp. 16–29. Springer, Heidelberg (2014) Virouleau, P., Brunet, P., Broquedis, F., Furmento, N., Thibault, S., Aumage, O., Gautier, T.: Evaluation of OpenMP dependent tasks with the KASTORS benchmark suite. In: DeRose, L., Supinski, B.R., Olivier, S.L., Chapman, B.M., Müller, M.S. (eds.) IWOMP 2014. LNCS, vol. 8766, pp. 16–29. Springer, Heidelberg (2014)
Metadaten
Titel
Task-Based Cholesky Decomposition on Knights Corner Using OpenMP
verfasst von
Joseph Dorris
Jakub Kurzak
Piotr Luszczek
Asim YarKhan
Jack Dongarra
Copyright-Jahr
2016
DOI
https://doi.org/10.1007/978-3-319-46079-6_37

Neuer Inhalt