Skip to main content

2018 | OriginalPaper | Buchkapitel

Profiling and Debugging Support for the Kokkos Programming Model

verfasst von : Simon D. Hammond, Christian R. Trott, Daniel Ibanez, Daniel Sunderland

Erschienen in: High Performance Computing

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Supercomputing hardware is undergoing a period of significant change. In order to cope with the rapid pace of hardware and, in many cases, programming model innovation, we have developed the Kokkos Programming Model – a C++-based abstraction that permits performance portability across diverse architectures. Our experience has shown that the abstractions developed can significantly frustrate debugging and profiling activities because they break expected code proximity and layout assumptions. In this paper we present the Kokkos Profiling interface, a lightweight, suite of hooks to which debugging and profiling tools can attach to gain deep insights into the execution and data structure behaviors of parallel programs written to the Kokkos interface.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Bartlett, R.A.: Teuchos C++ memory management classes, idioms, and related topics, the complete reference: a comprehensive strategy for safe and efficient memory management in C++ for high performance computing. Technical report, SAND2010-2234, Sandia National Laboratories (2010) Bartlett, R.A.: Teuchos C++ memory management classes, idioms, and related topics, the complete reference: a comprehensive strategy for safe and efficient memory management in C++ for high performance computing. Technical report, SAND2010-2234, Sandia National Laboratories (2010)
2.
Zurück zum Zitat Bell, N., Hoberock, J.: Thrust: a productivity-oriented library for CUDA. In: GPU Computing Gems Jade Edition, pp. 359–371. Elsevier (2011) Bell, N., Hoberock, J.: Thrust: a productivity-oriented library for CUDA. In: GPU Computing Gems Jade Edition, pp. 359–371. Elsevier (2011)
4.
Zurück zum Zitat Dagum, L., Menon, R.: OpenMP: an industry standard API for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef Dagum, L., Menon, R.: OpenMP: an industry standard API for shared-memory programming. IEEE Comput. Sci. Eng. 5(1), 46–55 (1998)CrossRef
5.
Zurück zum Zitat Edwards, H.C., Sunderland, D., Porter, V., Amsler, C., Mish, S.: Manycore performance-portability: kokkos multidimensional array library. Sci. Program. 20(2), 89–114 (2012) Edwards, H.C., Sunderland, D., Porter, V., Amsler, C., Mish, S.: Manycore performance-portability: kokkos multidimensional array library. Sci. Program. 20(2), 89–114 (2012)
6.
Zurück zum Zitat Edwards, H.C., Trott, C.R.: Kokkos: enabling performance portability across manycore architectures. In: Extreme Scaling Workshop (XSW), pp. 18–24. IEEE (2013) Edwards, H.C., Trott, C.R.: Kokkos: enabling performance portability across manycore architectures. In: Extreme Scaling Workshop (XSW), pp. 18–24. IEEE (2013)
7.
Zurück zum Zitat Edwards, H.C., Trott, C.R., Sunderland, D.: Kokkos: enabling manycore performance portability through polymorphic memory access patterns. J. Parallel Distrib. Comput. 74(12), 3202–3216 (2014)CrossRef Edwards, H.C., Trott, C.R., Sunderland, D.: Kokkos: enabling manycore performance portability through polymorphic memory access patterns. J. Parallel Distrib. Comput. 74(12), 3202–3216 (2014)CrossRef
8.
Zurück zum Zitat Eichenberger, A., et al.: OMPT and OMPD: OpenMP tools application programming interfaces for performance analysis and debugging. In: International Workshop on OpenMP (IWOMP 2013) (2013) Eichenberger, A., et al.: OMPT and OMPD: OpenMP tools application programming interfaces for performance analysis and debugging. In: International Workshop on OpenMP (IWOMP 2013) (2013)
10.
Zurück zum Zitat Foley, D., Danskin, J.: Ultra-performance pascal GPU and NVLink interconnect. IEEE Micro 37(2), 7–17 (2017)CrossRef Foley, D., Danskin, J.: Ultra-performance pascal GPU and NVLink interconnect. IEEE Micro 37(2), 7–17 (2017)CrossRef
11.
Zurück zum Zitat Hammarlund, P., et al.: Haswell: the fourth-generation intel core processor. IEEE Micro 34(2), 6–20 (2014)CrossRef Hammarlund, P., et al.: Haswell: the fourth-generation intel core processor. IEEE Micro 34(2), 6–20 (2014)CrossRef
12.
Zurück zum Zitat Heroux, M.A., et al.: An overview of the trilinos project. ACM Trans. Math. Softw. (TOMS) 31(3), 397–423 (2005)MathSciNetCrossRef Heroux, M.A., et al.: An overview of the trilinos project. ACM Trans. Math. Softw. (TOMS) 31(3), 397–423 (2005)MathSciNetCrossRef
13.
Zurück zum Zitat Jain, T., Agrawal, T.: The haswell microarchitecture - 4th generation processor. Int. J. Comput. Sci. Inf. Technol. 4(3), 477–480 (2013) Jain, T., Agrawal, T.: The haswell microarchitecture - 4th generation processor. Int. J. Comput. Sci. Inf. Technol. 4(3), 477–480 (2013)
14.
15.
16.
Zurück zum Zitat Messina, P.: The U.S. D.O.E. Exascale Computing Project – Goals and Challenges, February 2017 Messina, P.: The U.S. D.O.E. Exascale Computing Project – Goals and Challenges, February 2017
17.
Zurück zum Zitat Nalamalpu, A., et al.: Broadwell: a family of IA 14nm processors. In: 2015 Symposium on VLSI Circuits (VLSI Circuits), pp. C314–C315. IEEE (2015) Nalamalpu, A., et al.: Broadwell: a family of IA 14nm processors. In: 2015 Symposium on VLSI Circuits (VLSI Circuits), pp. C314–C315. IEEE (2015)
18.
Zurück zum Zitat Pheatt, C.: Intel threading building blocks. J. Comput. Sci. Coll. 23(4), 298–298 (2008) Pheatt, C.: Intel threading building blocks. J. Comput. Sci. Coll. 23(4), 298–298 (2008)
20.
Zurück zum Zitat Schulz, M., Galarowicz, J., Maghrak, D., Hachfeld, W., Montoya, D., Cranford, S.: Open|SpeedShop: an open source infrastructure for parallel performance analysis. Sci. Programm. 16(2–3), 105–121 (2008) Schulz, M., Galarowicz, J., Maghrak, D., Hachfeld, W., Montoya, D., Cranford, S.: Open|SpeedShop: an open source infrastructure for parallel performance analysis. Sci. Programm. 16(2–3), 105–121 (2008)
21.
Zurück zum Zitat Shende, S.S., Malony, A.D.: The TAU parallel performance system. Int. J. High Perform. Comput. Appl. 20(2), 287–311 (2006)CrossRef Shende, S.S., Malony, A.D.: The TAU parallel performance system. Int. J. High Perform. Comput. Appl. 20(2), 287–311 (2006)CrossRef
22.
Zurück zum Zitat Sodani, A.: Knights landing (KNL): 2nd generation Intel Xeon Phi processor. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1–24. IEEE (2015) Sodani, A.: Knights landing (KNL): 2nd generation Intel Xeon Phi processor. In: 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1–24. IEEE (2015)
23.
Zurück zum Zitat Sodani, A., et al.: Knights landing: second-generation Intel Xeon Phi product. IEEE Micro 36(2), 34–46 (2016)CrossRef Sodani, A., et al.: Knights landing: second-generation Intel Xeon Phi product. IEEE Micro 36(2), 34–46 (2016)CrossRef
24.
Zurück zum Zitat Tallent, N., Mellor-Crummey, J., Adhianto, L., Fagan, M., Krentel, M.: HPCToolkit: performance tools for scientific computing. In: Journal of Physics: Conference Series, vol. 125, p. 012088. IOP Publishing (2008) Tallent, N., Mellor-Crummey, J., Adhianto, L., Fagan, M., Krentel, M.: HPCToolkit: performance tools for scientific computing. In: Journal of Physics: Conference Series, vol. 125, p. 012088. IOP Publishing (2008)
25.
Zurück zum Zitat Thompto, B.: POWER9: processor for the cognitive era. In: 2016 IEEE Hot Chips 28 Symposium (HCS), pp. 1–19. IEEE (2016) Thompto, B.: POWER9: processor for the cognitive era. In: 2016 IEEE Hot Chips 28 Symposium (HCS), pp. 1–19. IEEE (2016)
26.
Zurück zum Zitat Zenker, E., et al.: Alpaka-an abstraction library for parallel kernel acceleration. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, pp. 631–640. IEEE (2016) Zenker, E., et al.: Alpaka-an abstraction library for parallel kernel acceleration. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, pp. 631–640. IEEE (2016)
Metadaten
Titel
Profiling and Debugging Support for the Kokkos Programming Model
verfasst von
Simon D. Hammond
Christian R. Trott
Daniel Ibanez
Daniel Sunderland
Copyright-Jahr
2018
DOI
https://doi.org/10.1007/978-3-030-02465-9_53

Premium Partner