2019 | OriginalPaper | Chapter

GPU_MF_SGD: A Novel GPU-Based Stochastic Gradient Descent Method for Matrix Factorization

Authors : Mohamed A. Nassar, Layla A. A. El-Sayed, Yousry Taha

Published in: Advances in Information and Communication Networks

Publisher: Springer International Publishing

Abstract

Recommender systems are used in most modern applications. Providing real-time suggestions with high accuracy is one of the most crucial challenges they face. Matrix factorization (MF) is an effective technique for recommender systems because it improves accuracy. Stochastic Gradient Descent (SGD) is the most popular approach used to speed up MF. SGD is a sequential algorithm that is not trivial to parallelize, especially for large-scale problems. Recently, many studies have proposed methods for parallelizing SGD. In this research, we propose GPU_MF_SGD, a novel GPU-based method for large-scale recommender systems. GPU_MF_SGD utilizes Graphics Processing Unit (GPU) resources by ensuring load balancing and linear scalability, and by achieving coalesced access to global memory without a preprocessing phase. Our method demonstrates a 3.1X–5.4X speedup over the state-of-the-art GPU method, CuMF_SGD.
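
For readers unfamiliar with the baseline, the following is a minimal sketch of plain sequential SGD for matrix factorization, not the GPU_MF_SGD method itself. It assumes the common L2-regularized squared-error objective; the function names (`sgd_mf`, `rmse`), the hyperparameter values, and the toy ratings are illustrative assumptions and do not come from the chapter.

```python
import random


def sgd_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Sequential SGD for R ~ P * Q^T over observed (user, item, rating) triples.

    All defaults are illustrative assumptions, not values from the chapter.
    """
    rng = random.Random(seed)
    # Small random initial factors: one k-vector per user and per item.
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the observed ratings in random order
        for u, i, r in samples:
            pred = sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Each update reads and writes both P[u] and Q[i]; two samples
                # that share a user or an item therefore conflict if they are
                # processed concurrently.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(ratings, P, Q):
    """Root-mean-square error of the factor model on the given ratings."""
    se = 0.0
    for u, i, r in ratings:
        pred = sum(pu * qi for pu, qi in zip(P[u], Q[i]))
        se += (r - pred) ** 2
    return (se / len(ratings)) ** 0.5


# Toy data (hypothetical): 3 users, 3 items, 6 observed ratings.
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = sgd_mf(toy_ratings, n_users=3, n_items=3)
print("training RMSE:", rmse(toy_ratings, P, Q))
```

The inner update shows why the algorithm is inherently sequential: every step depends on the factor rows left behind by the previous step, so parallel variants must either partition the ratings so that concurrent updates touch disjoint rows or tolerate occasional conflicting writes.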

Title: GPU_MF_SGD: A Novel GPU-Based Stochastic Gradient Descent Method for Matrix Factorization
Authors: Mohamed A. Nassar, Layla A. A. El-Sayed, Yousry Taha
Publisher: Springer International Publishing
Book: Advances in Information and Communication Networks
Print ISBN: 978-3-030-03404-7
Electronic ISBN: 978-3-030-03405-4

Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-03405-4_18