Published in: The Journal of Supercomputing 11/2020

04.02.2020

Performance modeling of the sparse matrix–vector product via convolutional neural networks

By: Maria Barreda, Manuel F. Dolz, M. Asunción Castaño, Pedro Alonso-Jordá, Enrique S. Quintana-Ortí



Abstract

Modeling the execution time of the sparse matrix–vector multiplication (SpMV) on a current CPU architecture is especially complex due to (i) irregular memory accesses; (ii) indirect memory referencing; and (iii) low arithmetic intensity. While analytical models may yield accurate estimates for the total number of cache hits/misses, they often fail to accurately predict the total execution time. In this paper, we depart from the analytical approach and instead leverage convolutional neural networks (CNNs) to provide an effective estimation of the performance of the SpMV operation. For this purpose, we present a high-level abstraction of the sparsity pattern of the problem matrix and propose a blockwise strategy to feed the CNN models with blocks of nonzero elements. The experimental evaluation on a representative subset of the matrices from the SuiteSparse Matrix collection demonstrates the robustness of the CNN models for predicting the SpMV performance on an Intel Haswell core. Furthermore, we show how to generalize the network models to other target architectures in order to estimate the performance of SpMV on an ARM A57 core.
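The complexity sources listed in the abstract can be seen in a minimal CSR-based SpMV kernel (a generic sketch for illustration, not the paper's implementation): the column-index array forces an indirect, irregular read of the input vector, and each nonzero contributes only two floating-point operations.

```python
# Minimal CSR sparse matrix-vector product y = A @ x.
# row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
# col_idx[k] gives the column of values[k].
def spmv_csr(row_ptr, col_idx, values, x):
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]  # indirect, irregular access into x
        y[i] = acc
    return y

# 3x3 example matrix [[4, 0, 1], [0, 2, 0], [3, 0, 5]]:
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values  = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]
print(spmv_csr(row_ptr, col_idx, values, x))  # [7.0, 4.0, 18.0]
```

Because `col_idx` depends on the sparsity pattern of the matrix, the access pattern into `x` varies from matrix to matrix, which is precisely what makes a single analytical execution-time model hard to build.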


Footnotes
1
The arithmetic intensity is defined as the ratio of total floating-point operations to total data movement (in bytes).
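Applying this definition to CSR SpMV gives a quick back-of-envelope estimate (an illustrative model with assumed 8-byte double values and 4-byte integer indices; it counts only the streaming of `values` and `col_idx` and ignores the vector and row-pointer traffic):

```python
# Rough arithmetic intensity of CSR SpMV: each nonzero costs 2 flops
# (one multiply, one add) and streams one 8-byte value plus one
# 4-byte column index. Vector and row_ptr traffic are neglected here.
def spmv_arithmetic_intensity(nnz):
    flops = 2 * nnz
    bytes_moved = nnz * (8 + 4)  # values + col_idx only
    return flops / bytes_moved

print(spmv_arithmetic_intensity(1_000_000))  # 1/6 ≈ 0.167 flops/byte
```

An intensity of roughly 0.17 flops/byte is far below what modern CPUs need to reach peak floating-point throughput, which is why SpMV is memory-bound and its execution time is so sensitive to the memory-access pattern.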
 
Metadata
Title
Performance modeling of the sparse matrix–vector product via convolutional neural networks
Authors
Maria Barreda
Manuel F. Dolz
M. Asunción Castaño
Pedro Alonso-Jordá
Enrique S. Quintana-Ortí
Publication date
04.02.2020
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 11/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-020-03186-1
