Published: 30-11-2020

DIESEL: A novel deep learning-based tool for SpMV computations and solving sparse linear equation systems

Authors: Thaha Mohammed, Aiiad Albeshri, Iyad Katib, Rashid Mehmood

Published in: The Journal of Supercomputing | Issue 6/2021

Abstract

Sparse linear algebra is central to many areas of engineering, science, and business. The community has done considerable work on new methods for sparse matrix-vector multiplication (SpMV) computations and iterative sparse solvers on graphics processing units (GPUs). Because matrix features vary so widely, no single method performs well across all sparse matrices. A few tools for automatically predicting the best-performing SpMV kernel have emerged recently, but much more work is needed to realize their potential. Existing SpMV kernels use far less than a GPU's full capacity. Moreover, the development and performance analysis of SpMV techniques on GPUs have not been studied in sufficient depth. This paper proposes DIESEL, a deep learning-based tool that predicts and executes the best-performing SpMV kernel for a given matrix, using a feature set that we devised through rigorous empirical and mathematical analysis. The dataset comprises 1056 matrices from 26 different real-life application domains, including computational fluid dynamics, materials, electromagnetics, economics, and more. We propose a range of new metrics and methods for performance analysis, visualization, and comparison of SpMV tools. DIESEL achieves an accuracy of \(88.2\%\), a workload accuracy of \(91.96\%\), and an average relative loss of \(4.4\%\), compared with \(85.9\%\), \(85.31\%\), and \(7.65\%\), respectively, for the next best-performing artificial intelligence (AI)-based SpMV tool. The extensive results and analyses presented in this paper provide several key insights into the performance of SpMV tools and how it relates to the matrix datasets and the performance metrics, allowing the community to further improve and compare basic and AI-based SpMV tools in the future.
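For readers unfamiliar with the operation the abstract refers to, the following is a minimal sequential sketch of SpMV, y = Ax, with the matrix held in compressed sparse row (CSR) form. CSR is used here only as one common storage format; the abstract does not say which formats or kernels DIESEL selects among, and the function name `spmv_csr` and the small example matrix are illustrative assumptions rather than anything taken from the paper.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # The nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example: A = [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [7.0, 6.0, 17.0]
```

The inner loop length differs from row to row, which is one reason the best-performing storage format and kernel depend on the sparsity pattern of the matrix.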

Title: DIESEL: A novel deep learning-based tool for SpMV computations and solving sparse linear equation systems
Authors: Thaha Mohammed, Aiiad Albeshri, Iyad Katib, Rashid Mehmood
Publication date: 30-11-2020
Publisher: Springer US
Published in: The Journal of Supercomputing / Issue 6/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI: https://doi.org/10.1007/s11227-020-03489-3
