ABSTRACT
Matrix multiplication is a building block of social network analysis. Various methods for GPU-based matrix multiplication have recently been proposed, including several from NVIDIA, one of the major GPU manufacturers. In this paper, we introduce these methods and evaluate their performance through extensive experiments on synthetic and real-world datasets. Our results should help practitioners choose the most suitable method for analyzing real-world social networks.
Index Terms
- GPU-based matrix multiplication methods for social networks analysis