2018 | OriginalPaper | Chapter

Profiling and Debugging Support for the Kokkos Programming Model

Authors: Simon D. Hammond, Christian R. Trott, Daniel Ibanez, Daniel Sunderland

Published in: High Performance Computing

Publisher: Springer International Publishing

Abstract

Supercomputing hardware is undergoing a period of significant change. To cope with the rapid pace of innovation in hardware and, in many cases, in programming models, we have developed the Kokkos Programming Model, a C++-based abstraction that permits performance portability across diverse architectures. Our experience has shown that these abstractions can significantly frustrate debugging and profiling because they break the assumptions that tools make about code proximity and layout. In this paper we present the Kokkos Profiling interface, a lightweight suite of hooks to which debugging and profiling tools can attach to gain deep insight into the execution and data structure behavior of parallel programs written to the Kokkos interface.
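
To make the idea of a "suite of hooks" concrete, the following is a minimal sketch of what a tool attaching to such an interface might look like. It is not taken from the paper: the hook names and signatures follow the `kokkosp_*` callback convention used by the publicly available Kokkos Tools, as best understood here, and the `KernelRecord`, `KernelTotals`, `g_launches` and `g_totals` names are hypothetical helpers introduced only for illustration. The sketch times `parallel_for` kernels by name and prints totals at shutdown.

```cpp
// Sketch only: a tiny kernel timer built on kokkosp_* style callbacks.
// Hook signatures are assumed from the public Kokkos Tools convention;
// all other names are hypothetical.
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

namespace {
using Clock = std::chrono::steady_clock;

struct KernelRecord {   // one entry per parallel_for launch
  std::string name;
  Clock::time_point start;
};

struct KernelTotals {   // accumulated per kernel name
  std::uint64_t calls = 0;
  double seconds = 0.0;
};

// Grows by one entry per launch; acceptable for a sketch, not for long runs.
std::vector<KernelRecord> g_launches;
std::map<std::string, KernelTotals> g_totals;
}  // namespace

// Called once when the runtime loads the tool.
extern "C" void kokkosp_init_library(const int /*loadSeq*/,
                                     const std::uint64_t /*interfaceVer*/,
                                     const std::uint32_t /*devInfoCount*/,
                                     void* /*deviceInfo*/) {
  g_launches.clear();
  g_totals.clear();
}

// Called before a parallel_for starts; the tool hands back an id in *kID.
extern "C" void kokkosp_begin_parallel_for(const char* name,
                                           const std::uint32_t /*devID*/,
                                           std::uint64_t* kID) {
  *kID = g_launches.size();
  g_launches.push_back(KernelRecord{name, Clock::now()});
}

// Called after the kernel completes, with the id issued by the begin hook.
extern "C" void kokkosp_end_parallel_for(const std::uint64_t kID) {
  assert(kID < g_launches.size());
  const KernelRecord& rec = g_launches[kID];
  KernelTotals& totals = g_totals[rec.name];
  totals.calls += 1;
  totals.seconds +=
      std::chrono::duration<double>(Clock::now() - rec.start).count();
}

// Called once at shutdown; report what was collected.
extern "C" void kokkosp_finalize_library() {
  for (const auto& entry : g_totals) {
    std::printf("%s: %llu calls, %.6f s\n", entry.first.c_str(),
                static_cast<unsigned long long>(entry.second.calls),
                entry.second.seconds);
  }
}
```

Under these assumptions the file would be compiled as a shared library and loaded at run time, without recompiling the application. In the Kokkos releases known here the library is selected through an environment variable (`KOKKOS_PROFILE_LIBRARY` in older versions, `KOKKOS_TOOLS_LIBS` in newer ones), so check the documentation for the version in use. Because the kernel name is passed to the hook as a string, the tool can attribute time to user-visible kernels even though the executed code lives deep inside templated C++ abstractions.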

Metadata
Title: Profiling and Debugging Support for the Kokkos Programming Model
Authors: Simon D. Hammond, Christian R. Trott, Daniel Ibanez, Daniel Sunderland
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-030-02465-9_53
