nach oben

Erschienen in:

2020 | OriginalPaper | Buchkapitel

17. Performance Characteristics for Sparse Matrix-Vector Multiplication on GPUs

verfasst von : Sarah AlAhmadi, Thaha Muhammed, Rashid Mehmood, Aiiad Albeshri

Erschienen in: Smart Infrastructure and Applications

Verlag: Springer International Publishing

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

The massive parallelism provided by the graphics processing units (GPUs) offers tremendous performance in many high-performance computing applications. One such application is Sparse Matrix-Vector (SpMV) multiplication, which is an essential building block for numerous scientific and engineering applications. Researchers who propose new storage techniques for sparse matrix-vector multiplication focus mainly on a single evaluation metrics or performance characteristics which is usually the throughput performance of sparse matrix-vector multiplication in FLOPS. However, such an evaluation does not provide a deeper insight nor allow to compare new SpMV techniques with their competitors directly. In this chapter, we explain the notable performance characteristics of the GPU architectures and SpMV computations. We discuss various strategies to improve the performance of SpMV on GPUs. We also discuss a few performance criteria that are usually overlooked by the researchers during the evaluation process. We also analyze various well-known schemes such as COO, CSR, ELL, DIA, HYB, and CSR5 using the discussed performance characteristics.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Parallel Iterative Solution of Large Sparse Linear Equation Systems on the Intel MIC Architecture

Nächstes Kapitel HPC-Smart Infrastructures: A Review and Outlook on Performance Analysis Methods and Tools

Yang, W., Li, K., Li, K.: A hybrid computing method of SpMV on CPU–GPU heterogeneous computing systems. J. Parallel Distrib. Comput. 104, 49–60 (2017)CrossRef

Mehmood, R., Lu, J.A.: Computational Markovian analysis of large systems. J. Manuf. Technol. Manag. 22, 804–817 (2011)CrossRef

Mehmood, R., Meriton, R., Graham, G., Hennelly, P., Kumar, M.: Exploring the influence of big data on city transport operations: a Markovian approach. Int. J. Oper. Prod. Manag. 37, 75–104 (2017)CrossRef

Mehmood, R., Alturki, R., Zeadally, S.: Multimedia applications over metropolitan area networks (MANs). J. Netw. Comput. Appl. 34, 1518–1529 (2011)CrossRef

Mehmood, R., Graham, G.: Big data logistics: a health-care transport capacity sharing model. Proc. Comput. Sci. 64, 1107–1114 (2015)CrossRef

Altowaijri, S., Mehmood, R., Williams, J.: A quantitative model of grid systems performance in healthcare organisations. ISMS 2010—UKSim/AMSS 1st International Conference on Intelligent Systems. Model. Simul. 431–436 (2010)

Huan, G., Qian, Z.: A new method of sparse matrix-vector multiplication on GPU. In: International Conference on Computer Science and Network Technology, pp. 954–958 (2012)

Hassani, R., Fazely, A., Choudhury, R.-U.-A., Luksch, P.: Analysis of sparse matrix-vector multiplication using iterative method in CUDA. In: 2013 IEEE Eighth International Conference on Networking, Architecture and Storage, pp. 262–266 (2013)CrossRef

Cheik Ahamed, A.-K., Magoulès, F.: Efficient implementation of Jacobi iterative method for large sparse linear systems on graphic processing units. J. Supercomput. 73, 3411–3432 (2017)CrossRef

10.

Adhianto, L., Banerjee, S., Fagan, M., Krentel, M., Marin, G., Mellor-Crummey, J., Tallent, N.R.: HPCTOOLKIT: tools for performance analysis of optimized parallel programs. Concurr. Comput. Pract. Exp. 22, 685–701 (2010). http://hpctoolkit.org

11.

Brahme, D., Mishra, B.R., Barve, A.: Parallel sparse matrix vector multiplication using greedy extraction of boxes. In: 2010 International Conference on High Performance Computing, pp. 1–10 (2010)

12.

Ahamed, A.-K.C., Magoules, F.: Fast sparse matrix-vector multiplication on graphics processing unit for finite element analysis. In: 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems, pp. 1307–1314 (2012)CrossRef

13.

Guo, P., Wang, L., Chen, P.: A performance modeling and optimization analysis tool for sparse matrix-vector multiplication on GPUs. IEEE Trans. Parallel Distrib. Syst. 25, 1112–1123 (2014)CrossRef

14.

Guo, P., Wang, L.: Auto-tuning CUDA parameters for sparse matrix-vector multiplication on GPUs. In: Proceedings—2010 International Conference on Computational and Information Sciences, ICCIS 2010, pp. 1154–1157 (2010)CrossRef

15.

Merrill, D., Garland, M.: Merge-based parallel sparse matrix-vector multiplication. In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 16, pp. 678–689 (2016)CrossRef

16.

Hou, K., Feng, W.C., Che, S.: Auto-tuning strategies for parallelizing sparse matrix-vector (SpMV) multiplication on multi- and many-core processors. In: Proceedings—2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017, pp. 713–722 (2017)

17.

Mehmood, R., Crowcroft, J.: Parallel Iterative Solution Method for Large Sparse Linear Equation Systems. UCAM-CL-TR-650. University of Cambridge, Computer Laboratory (2005)

18.

Mehmood, R.: Disk-Based Techniques for Efficient Solution of Large Markov Chains. Computer Science, University of Birmingham (2004)

19.

Barrett, R., Berry, M., Chan, T.F., Demmel, J., Donato, J., Dongarra, J., Eijkhout, V., Pozo, R., Romine, C., van der Vorst, H.: Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. Society for Industrial and Applied Mathematics, Philadelphia (1994)CrossRef

20.

Eleliemy, A., Fayez, M., Mehmood, R., Katib, I., Aljohani, N.: Loadbalancing on parallel heterogeneous architectures: spin-image algorithm on CPU and MIC. In: 9th EUROSIM Congress on Modelling and Simulation. EUROSIM (2016)

21.

Kwiatkowska, M., Mehmood, R.: Out-of-Core solution of large linear Systems of Equations Arising from stochastic modelling. In: Process Algebra and Probabilistic Methods: Performance Modeling and Verification, pp. 135–151. Springer, Berlin (2002)CrossRef

22.

Kwiatkowska, M., Mehmood, R., Norman, G., Parker, D.: A symbolic out-of-core solution method for Markov models. Electr. Notes Theor. Comput. Sci. 68, 589–604 (2002)CrossRef

23.

Mehmood, R.: A Survey of Out-of-Core Analysis Techniques in Stochastic Modelling. University of Birmingham, UK (2003)

24.

Mehmood, R.: Serial disk-based analysis of large stochastic models. In: Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.-P., Siegle, M. (eds.) Validation of Stochastic Systems: A Guide to Current Research, pp. 230–255. Springer, Berlin (2004)CrossRef

25.

Mehmood, R., Crowcroft, J., Elmirghani, J.M.H.: A parallel implicit method for the steady-state solution of CTMCs. In: 14th IEEE International Symposium on Modeling, Analysis, and Simulation, pp. 293–302 (2006)CrossRef

26.

Mehmood, R., Parker, D., Kwiatkowska, M.: An Efficient BDD-Based Implementation of Gauss-Seidel for CTMC Analysis. University of Birmingham, UK (2003)

27.

Mehmood, R., Parker, D., Kwiatkowska, M.: An Efficient Symbolic Out-of-Core Solution Method for Markov Models., University of Birmingham, UK (2003)

28.

Magoulès, F., Ahamed, A.-K.C.: Alinea: an advanced linear algebra library for massively parallel computations on graphics processing units. Int. J. High Perform. Comput. Appl. 29, 284–310 (2015)CrossRef

29.

Muhammed, T., Mehmood, R., Albeshri, A., Katib, I.: UbeHealth: a personalized ubiquitous cloud and edge-enabled networked healthcare system for smart cities. IEEE Access. 6, 32258–32285 (2018)CrossRef

30.

Owens, J.D., Houston, M., Luebke, D., Green, S., Stone, J.E., Phillips, J.C.: GPU computing. Proc. IEEE. 879–899 (2008)

31.

Fevgas, A., Daloukas, K., Tsompanopoulou, P., Bozanis, P.: Efficient solution of large sparse linear systems in modern hardware. In: 2015 6th International Conference on Information, Intelligence, Systems and Applications (IISA), pp. 1–6 (2015)

32.

Kirk, D.B., Hwu, W.M.W.: Programming Massively Parallel Processors: A Hands-on Approach (2013)

33.

Cheng, J., Grossman, M., McKercher, T.: Professional CUDA C Programming. Wiley, New York (2014)

34.

NVIDIA: Pascal GPU Architecture | NVIDIA. https://www.nvidia.com/en-us/data-center/pascal-gpu-architecture

35.

NVIDIA: NVIDIA Tesla P100 Whitepaper (2016)

36.

NVIDIA: NVIDIA Tesla V100. https://www.nvidia.com/en-us/data-center/tesla-v100/?ncid=van-tesla-v100

37.

NVIDIA: History of NVIDIA – From Graphics Cards to Mobile Processors. http://www.nvidia.co.uk/object/corporate-timeline-uk.html

38.

Nvidia: Nvidia CUDA C Programming Guide Version 4.2 (2012)

39.

Bell, N., Garland, M.: Efficient Sparse Matrix-Vector Multiplication on CUDA (2008)

40.

Nvidia: Profiler User’s Guide

41.

Saad, Y.: SPARSKIT: A Basic Tool Kit for Sparse Matrix Computations Version 2 (1994)

42.

Bell, N., Garland, M.: Implementing sparse matrix-vector multiplication on throughput-oriented processors. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis—SC ’09, p. 1 (2009)

43.

Liu, W., Vinter, B.: CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication. Arxiv, Ithaca, NY (2015)CrossRef

44.

Guo, P., Lee, C.W.: A performance prediction and analysis integrated framework for SpMV on GPUs. In: Procedia Computer Science, pp. 178–189. The Author(s), (2016)CrossRef

45.

Fujita, M., McGeer, P.C., Yang, J.C.-Y.: Multi-terminal binary decision diagrams: an efficient data structure for matrix representation. Formal Meth. Syst. Design. 10, 149–169 (1997)CrossRef

46.

Maggioni, M., Berger-Wolf, T.: Optimization techniques for sparse matrix-vector multiplication on GPUs. J. Parallel Distrib. Comput. 93-94, 66–86 (2016)CrossRef

47.

Filippone, S., Cardellini, V., Barbieri, D., Fanfarillo, A.: Sparse matrix-vector multiplication on GPGPUs. ACM Trans. Math. Softw. 43, 1–49 (2017)MathSciNetCrossRef

48.

Williams, S., Oliker, L., Vuduc, R., Shalf, J., Yelick, K., Demmel, J.: Optimization of sparse matrix-vector multiplication on emerging multicore platforms—long version. Parallel Comput. 35, 178–194 (2009)CrossRef

Titel: Performance Characteristics for Sparse Matrix-Vector Multiplication on GPUs
verfasst von: Sarah AlAhmadi
Thaha Muhammed
Rashid Mehmood
Aiiad Albeshri
Verlag: Springer International Publishing
Buch: Smart Infrastructure and Applications
Print ISBN: 978-3-030-13704-5

Electronic ISBN: 978-3-030-13705-2

Copyright-Jahr: 2020
DOI: https://doi.org/10.1007/978-3-030-13705-2_17

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"