Published in: The Journal of Supercomputing 3/2020

23.04.2019

Scalability analysis of AVX-512 extensions

Written by: Juan M. Cebrian, Lasse Natvig, Magnus Jahre


Abstract

Energy efficiency below a specific thermal design power (TDP) has become the main design goal for microprocessors across all market segments. Optimizing the use of the available transistors within the TDP remains an open problem. Parallelism is the foundation for reaching the exascale level. While instruction-level and thread-level parallelism are embraced by developers, data-level parallelism (e.g. single-instruction multiple-data execution) is usually underutilized despite its huge potential. Companies are pushing the size of vector registers to double every 4 years. Intel’s AVX-512 (512-bit registers) and ARM’s SVE (up to 2048-bit registers) are examples of this trend. In this paper, we perform a scalability and energy efficiency analysis of AVX-512 using the ParVec benchmark suite. ParVec is extended to support AVX-512 as well as the newest versions of the GCC compiler. We use Intel’s Top–Down model to show the main bottlenecks of the architecture for each studied benchmark. Results show that the performance and energy improvements depend greatly on the fraction of code that can be vectorized. Energy improvements over scalar codes in a single-thread environment range from 2\(\times\) for Streamcluster (worst) to 8\(\times\) for Blackscholes (best).
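
The dependence on the vectorizable fraction of code can be illustrated with a simple Amdahl-style bound. This is a minimal sketch and not the model used in the paper: the function name `vector_speedup`, the constant `LANES_FP32_AVX512`, and the sample fractions are hypothetical. It assumes single-precision data (512 / 32 = 16 lanes) and ignores memory bottlenecks, frequency changes, and any instruction overhead of vectorization.

```python
def vector_speedup(vector_fraction: float, lanes: int) -> float:
    """Idealized upper bound on speedup over scalar code.

    vector_fraction: share of scalar run time that can be vectorized (0..1).
    lanes: number of elements processed per vector instruction.
    Assumption: the vectorized part speeds up by exactly `lanes`.
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be in [0, 1]")
    if lanes < 1:
        raise ValueError("lanes must be >= 1")
    # Scalar remainder runs at the original speed; the rest is divided by `lanes`.
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / lanes)


# Assumed element size: 32-bit floats in a 512-bit register.
LANES_FP32_AVX512 = 512 // 32

if __name__ == "__main__":
    # Illustrative fractions only; these are not measurements from the paper.
    for fraction in (0.5, 0.9, 0.99):
        bound = vector_speedup(fraction, LANES_FP32_AVX512)
        print(f"vectorizable fraction {fraction:.2f}: bound {bound:.2f}x")
```

Under these assumptions, even a code that is half vectorizable stays below a 2\(\times\) bound on 16 lanes, which is why the scalar remainder matters more as registers grow wider.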


Footnotes
1. Micro-operations.
2. Set of macros used to generate the intrinsics code using the C pre-processor.
3. Model Specific Registers.

Metadata
Title
Scalability analysis of AVX-512 extensions
Authors
Juan M. Cebrian
Lasse Natvig
Magnus Jahre
Publication date
23.04.2019
Publisher
Springer US
Published in
The Journal of Supercomputing, Issue 3/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-019-02840-7
