2017 | Original Paper | Book chapter

NVIDIA Jetson Platform Characterization

Authors: Hassan Halawa, Hazem A. Abdelhafez, Andrew Boktor, Matei Ripeanu

Published in: Euro-Par 2017: Parallel Processing

Publisher: Springer International Publishing

Abstract

This study characterizes the NVIDIA Jetson TK1 and TX1 platforms, both built on an NVIDIA Tegra System on Chip and combining a quad-core ARM CPU and an NVIDIA GPU. Their heterogeneous nature, as well as their wide operating frequency range, makes it hard for application developers to reason about performance and determine which optimizations are worth pursuing. This paper attempts to inform developers' choices by characterizing the platforms' performance using Roofline models obtained through an empirical measurement-based approach as well as through a case study of a heterogeneous application (matrix multiplication). Our results highlight a difference of more than an order of magnitude in compute performance between the CPU and GPU on both platforms. Given that the CPU and GPU share the same memory bus, their Roofline models' balance points are also more than an order of magnitude apart. We also explore the impact of frequency scaling: we build CPU and GPU Roofline profiles and characterize both platforms' balance point variation, power consumption, and performance per watt as frequency is scaled.
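To make the Roofline terms used above concrete, the following is a minimal Python sketch, not the authors' benchmark code. It assumes the standard Roofline formulation, in which attainable performance is the minimum of the compute roof and the memory roof (bandwidth times arithmetic intensity), and the balance point is peak compute divided by memory bandwidth. All numeric values are hypothetical placeholders chosen for illustration, not measurements from the paper, and the function names are our own.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof.

    intensity is arithmetic intensity in FLOPs per byte moved to/from memory.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def balance_point(peak_gflops: float, bandwidth_gbs: float) -> float:
    """Arithmetic intensity (FLOPs/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gbs


# Hypothetical values for illustration only (NOT taken from the paper).
SHARED_BANDWIDTH_GBS = 10.0  # CPU and GPU are assumed to share one memory bus
CPU_PEAK_GFLOPS = 20.0
GPU_PEAK_GFLOPS = 300.0

cpu_balance = balance_point(CPU_PEAK_GFLOPS, SHARED_BANDWIDTH_GBS)
gpu_balance = balance_point(GPU_PEAK_GFLOPS, SHARED_BANDWIDTH_GBS)

# With a shared bus, the ratio of balance points equals the ratio of peaks.
balance_ratio = gpu_balance / cpu_balance
peak_ratio = GPU_PEAK_GFLOPS / CPU_PEAK_GFLOPS

# A kernel at 4 FLOPs/byte: compute-bound on this CPU, memory-bound on this GPU.
cpu_bound = attainable_gflops(CPU_PEAK_GFLOPS, SHARED_BANDWIDTH_GBS, 4.0)
gpu_bound = attainable_gflops(GPU_PEAK_GFLOPS, SHARED_BANDWIDTH_GBS, 4.0)
```

Because both processors divide their peak by the same bandwidth in this sketch, any gap in compute peak carries over directly to the balance points, which is the relationship the abstract describes.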

The characterization we provide can be used in three main ways. First, given an application, it can inform the choice and number of processing elements to use (i.e., CPU/GPU and number of cores) as well as the optimizations likely to lead to high performance gains. Second, this characterization indicates that developers can use frequency scaling to tune the Jetson platform to suit the requirements of their applications. Third, given a required power/performance budget, application developers can identify the appropriate parameters to use to tune the Jetson platforms to their specific workload requirements. We expect that this optimization approach can lead to overall gains in performance and/or power efficiency without requiring application changes.
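As an illustration of the third use, the sketch below picks an operating frequency from a measured profile under a power budget. It is a hypothetical example, not the authors' tooling: the `OperatingPoint` type, the helper names, and every value in `PROFILE` are assumptions made up for this sketch. In practice the profile would come from measurements such as those described in the paper.

```python
from typing import NamedTuple, Optional, Sequence


class OperatingPoint(NamedTuple):
    """One measured configuration (hypothetical structure)."""
    freq_mhz: int
    gflops: float
    watts: float


def best_under_power_budget(
    points: Sequence[OperatingPoint], max_watts: float
) -> Optional[OperatingPoint]:
    """Fastest operating point whose power draw fits the budget, or None."""
    feasible = [p for p in points if p.watts <= max_watts]
    return max(feasible, key=lambda p: p.gflops, default=None)


def most_efficient(points: Sequence[OperatingPoint]) -> OperatingPoint:
    """Operating point with the highest performance per watt."""
    return max(points, key=lambda p: p.gflops / p.watts)


# Placeholder profile for illustration only (NOT data from the paper).
PROFILE = [
    OperatingPoint(freq_mhz=200, gflops=50.0, watts=2.0),
    OperatingPoint(freq_mhz=500, gflops=120.0, watts=4.0),
    OperatingPoint(freq_mhz=800, gflops=180.0, watts=8.0),
    OperatingPoint(freq_mhz=1000, gflops=210.0, watts=12.0),
]

choice = best_under_power_budget(PROFILE, max_watts=5.0)
efficient = most_efficient(PROFILE)
```

With this placeholder profile, both the 5 W budget and the efficiency criterion select the 500 MHz point, while a budget below 2 W yields no feasible point. The selection changes no application code, only the platform setting.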

Our benchmarks are available online at: https://bitbucket.org/nsl_europar17/benchmarks.

We tried several alternative techniques, such as using mprotect(), which changes the access permissions on a specific memory range. However, the NVIDIA driver locks the memory accessed by GPU kernels until they complete. It is therefore not possible for the CPU and GPU to access a shared matrix object at the same time, even when we use UMA and even if all accesses are read-only.

Title: NVIDIA Jetson Platform Characterization
Authors: Hassan Halawa, Hazem A. Abdelhafez, Andrew Boktor, Matei Ripeanu
Publisher: Springer International Publishing
Book: Euro-Par 2017: Parallel Processing
Print ISBN: 978-3-319-64202-4
Electronic ISBN: 978-3-319-64203-1
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-64203-1_7
