2014 | Original Paper | Book Chapter

Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi

Authors: Erik Saule, Kamer Kaya, Ümit V. Çatalyürek

Published in: Parallel Processing and Applied Mathematics

Publisher: Springer Berlin Heidelberg

Abstract

Intel Xeon Phi is a recently released high-performance coprocessor that features 61 cores, each supporting 4 hardware threads and 512-bit wide SIMD registers, for a theoretical peak performance of 1 Tflop/s in double precision. Its design differs from that of classical modern processors: it comes with a large number of cores, its 4-way hyperthreading capability allows many applications to saturate the massive memory bandwidth, and its wide SIMD units allow it to reach high computational throughput. The core of many scientific applications is the multiplication of a large, sparse matrix with one or more dense vectors, an operation that is memory-bound rather than compute-bound. In this paper, we investigate the performance of the Xeon Phi coprocessor for these sparse linear algebra kernels. We highlight the important hardware details and show that Xeon Phi’s sparse kernel performance is very promising and even better than that of cutting-edge CPUs and GPUs.
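To make the two kernels named in the abstract concrete, here is a minimal sketch in plain Python of sparse matrix times dense vector (SpMV) and sparse matrix times several dense vectors (SpMM). It assumes the compressed sparse row (CSR) layout, which the abstract does not specify. The function names, the 8-byte value and 4-byte index sizes, and the traffic model in `flops_per_byte` are illustrative assumptions, not the authors' implementation. The last function gives a rough estimate of why the operation tends to be memory-bound: it performs only two floating-point operations per stored nonzero while moving roughly twelve bytes for that nonzero.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A * x for a matrix A stored in CSR form (assumed layout)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * x[col_idx[p]]
        y[i] = acc
    return y


def spmm_csr(row_ptr, col_idx, values, X):
    """Compute Y = A * X, where X is a dense row-major list of rows.

    Each nonzero of A is loaded once and reused for all k columns of X,
    which is why multiplying several vectors at once raises the
    flop-per-byte ratio compared with k separate SpMV calls.
    """
    n_rows = len(row_ptr) - 1
    k = len(X[0]) if X else 0
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        y_row = Y[i]
        for p in range(row_ptr[i], row_ptr[i + 1]):
            a = values[p]
            x_row = X[col_idx[p]]
            for j in range(k):
                y_row[j] += a * x_row[j]
    return Y


def flops_per_byte(nnz, n, k=1, val_bytes=8, idx_bytes=4):
    """Rough arithmetic intensity of CSR SpMV/SpMM on a square n x n matrix.

    Assumed traffic model: every matrix entry and index is read once,
    and the input and output vectors each cross the memory bus once
    (ideal cache reuse). Real access patterns are usually worse.
    """
    flops = 2 * nnz * k  # one multiply and one add per nonzero per vector
    matrix_bytes = nnz * (val_bytes + idx_bytes) + (n + 1) * idx_bytes
    vector_bytes = 2 * n * k * val_bytes
    return flops / (matrix_bytes + vector_bytes)


if __name__ == "__main__":
    # A = [[1, 0, 2],
    #      [0, 3, 0],
    #      [4, 0, 5]]
    row_ptr = [0, 2, 3, 5]
    col_idx = [0, 2, 1, 0, 2]
    values = [1.0, 2.0, 3.0, 4.0, 5.0]

    print(spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0]))
    print(spmm_csr(row_ptr, col_idx, values,
                   [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]))
    print(flops_per_byte(nnz=5, n=3, k=1))
```

Under this simplified model the ratio stays well below one flop per byte for a single vector and grows as more vectors are multiplied together, which is consistent with the abstract's description of these kernels as memory-bound.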

Title: Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi
Authors: Erik Saule, Kamer Kaya, Ümit V. Çatalyürek
Publisher: Springer Berlin Heidelberg
Book: Parallel Processing and Applied Mathematics
Print ISBN: 978-3-642-55223-6
Electronic ISBN: 978-3-642-55224-3
Copyright year: 2014
DOI: https://doi.org/10.1007/978-3-642-55224-3_52
