
2015 | OriginalPaper | Book chapter

Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis

Authors: Yu Jung Lo, Samuel Williams, Brian Van Straalen, Terry J. Ligocki, Matthew J. Cordery, Nicholas J. Wright, Mary W. Hall, Leonid Oliker

Published in: High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation

Publisher: Springer International Publishing

Abstract

We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable, instrumented microbenchmarks implemented with the Message Passing Interface (MPI) and with OpenMP, which is used to express thread-level parallelism. These benchmarks are specialized to quantify the behavior of different architectural features. Compared to previous work on performance characterization, these microbenchmarks focus on capturing the performance of each level of the memory hierarchy, along with thread-level parallelism, instruction-level parallelism, and explicit SIMD parallelism, measured in the context of the compilers and run-time environments. We also measure sustained PCIe throughput with four GPU memory management mechanisms. By combining results from the architecture characterization with the Roofline model based solely on architectural specifications, this work offers insights for performance prediction of current and future architectures and their software systems. To that end, we instrument three applications and plot their resultant performance on the corresponding Roofline model when run on a Blue Gene/Q architecture.
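
For readers unfamiliar with the Roofline model mentioned above, its basic bound can be sketched in a few lines. This is only an illustration of the standard formulation, in which attainable performance is the minimum of peak compute and arithmetic intensity times peak memory bandwidth. It is not code from the toolkit; the function names and the machine numbers are hypothetical and are not taken from the paper.

```python
def roofline_bound(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Attainable GFLOP/s under the basic Roofline model.

    arithmetic_intensity: FLOPs performed per byte moved to/from memory.
    peak_gflops:          compute ceiling in GFLOP/s.
    peak_bandwidth_gbs:   memory bandwidth ceiling in GB/s.
    """
    if arithmetic_intensity <= 0:
        raise ValueError("arithmetic intensity must be positive")
    # A kernel is limited by whichever ceiling it hits first.
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity (FLOPs/byte) at which the two ceilings meet."""
    return peak_gflops / peak_bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical machine, chosen for illustration only (not measured values).
    PEAK_GFLOPS = 200.0
    PEAK_BANDWIDTH_GBS = 25.0

    print("ridge point:", ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS), "FLOPs/byte")
    for ai in (0.25, 1.0, 8.0, 16.0):
        bound = roofline_bound(ai, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
        print(f"AI = {ai:5.2f} FLOPs/byte -> at most {bound:6.1f} GFLOP/s")
```

In this sketch, kernels whose arithmetic intensity lies below the ridge point are bandwidth-bound and those above it are compute-bound. Measured ceilings, such as those a characterization engine would produce, could be passed in place of the specification-based peaks.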

Footnotes
1
GPU driver version: 331.89; CUDA toolkit version: 6.0beta.
 
Metadata
Title
Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis
Authors
Yu Jung Lo
Samuel Williams
Brian Van Straalen
Terry J. Ligocki
Matthew J. Cordery
Nicholas J. Wright
Mary W. Hall
Leonid Oliker
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-17248-4_7