skip to main content
research-article

Realtime top-k personalized pagerank over large graphs on GPUs

Authors Info & Claims
Published:01 September 2019Publication History
Skip Abstract Section

Abstract

Given a graph G, a source node sG and a positive integer k, a top-k Personalized PageRank (PPR) query returns the k nodes with the highest PPR values with respect to s, where the PPR of a node v measures its relevance from the perspective of source s. Top-k PPR processing is a fundamental task in many important applications such as web search, social networks, and graph analytics. This paper aims to answer such a query in realtime, i.e., within less than 100ms, on an Internet-scale graph with billions of edges. This is far beyond the current state of the art, due to the immense computational cost of processing a PPR query. We achieve this goal with a novel algorithm kPAR, which utilizes the massive parallel processing power of GPUs.

The main challenge in designing a GPU-based PPR algorithm lies in that a GPU is mainly a parallel computation device, whereas PPR processing involves graph traversals and value propagation operations, which are inherently sequential and memory-bound. Existing scalable PPR algorithms are mostly described as single-thread CPU solutions that are resistant to parallelization. Further, they usually involve complex data structures which do not have efficient adaptations on GPUs. kPAR overcomes these problems via both novel algorithmic designs (namely, adaptive forward push and inverted random walks) and system engineering (e.g., load balancing) to realize the potential of GPUs. Meanwhile, kPAR provides rigorous guarantees on both result quality and worst-case efficiency. Extensive experiments show that kPAR is usually 10x faster than parallel adaptations of existing methods. Notably, on a billion-edge Twitter graph, kPAR answers a top-1000 PPR query in 42.4 milliseconds.

References

  1. https://www.nngroup.com/articles/response-times-3-important-limits/.Google ScholarGoogle Scholar
  2. https://github.com/wangsibovictor/fora.Google ScholarGoogle Scholar
  3. https://github.com/wzskytop/TopPPR.Google ScholarGoogle Scholar
  4. http://snap.stanford.edu/data.Google ScholarGoogle Scholar
  5. http://law.di.unimi.it/datasets.php.Google ScholarGoogle Scholar
  6. https://sites.google.com/view/kpar-tr.Google ScholarGoogle Scholar
  7. https://developer.nvidia.com/nvgraph.Google ScholarGoogle Scholar
  8. R. Andersen, C. Borgs, J. T. Chayes, J. E. Hopcroft, V. S. Mirrokni, and S. Teng. Local computation of pagerank contributions. In WAW, pages 150--165, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. R. Andersen, F. R. K. Chung, and K. J. Lang. Local graph partitioning using pagerank vectors. In FOCS, pages 475--486, 2006.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. A. Ashari, N. Sedaghati, J. Eisenlohr, S. Parthasarathy, and P. Sadayappan. Fast sparse matrix-vector multiplication on gpus for graph applications. In SC, pages 781--792, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. In WSDM, pages 635--644, 2011.Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. B. Bahmani, K. Chakrabarti, and D. Xin. Fast personalized pagerank on mapreduce. In SIGMOD, pages 973--984, 2011.Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. B. Bahmani, A. Chowdhury, and A. Goel. Fast incremental and personalized pagerank. PVLDB, 4(3):173--184, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. S. Chakrabarti. Dynamic personalized pagerank in entity-relation graphs. In WWW, pages 571--580, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. F. R. K. Chung and L. Lu. Survey: Concentration inequalities and martingale inequalities: A survey. Internet Mathematics, 3(1):79--127, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  16. L. Dagum and R. Menon. Openmp: An industry-standard api for shared-memory programming. CiSE, pages 46--55, 1998.Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. D. Fogaras, B. Rácz, K. Csalogány, and T. Sarlos. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, 2(3):333--358, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  18. Y. Fujiwara, M. Nakatsuji, H. Shiokawa, T. Mishima, and M. Onizuka. Efficient ad-hoc search for personalized pagerank. In SIGMOD, pages 445--456, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Y. Fujiwara, M. Nakatsuji, H. Shiokawa, T. Mishima, and M. Onizuka. Fast and exact top-k algorithm for pagerank. In AAAI, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Y. Fujiwara, M. Nakatsuji, T. Yamamuro, H. Shiokawa, and M. Onizuka. Efficient personalized pagerank with accuracy assurance. In KDD, pages 15--23, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. T. Guo, X. Cao, G. Cong, J. Lu, and X. Lin. Distributed algorithms on exact personalized pagerank. In SIGMOD, pages 479--494, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. W. Guo, Y. Li, M. Sha, and K.-L. Tan. Parallel personalized pagerank on dynamic graphs. PVLDB, 11(1):93--106, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. M. S. Gupta, A. Pathak, and S. Chakrabarti. Fast algorithms for topk personalized pagerank queries. In WWW, pages 1225--1226, 2008.Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. P. Gupta, A. Goel, J. Lin, A. Sharma, D. Wang, and R. Zadeh. Wtf: The who to follow service at twitter. In WWW, pages 505--514, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. P. Harish and P. Narayanan. Accelerating large graph algorithms on the gpu using cuda. In HiPC, pages 197--208, 2007.Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. G. Iván and V. Grolmusz. When the web meets the cell: using personalized pagerank for analyzing protein interaction networks. Bioinformatics, pages 405--407, 2011.Google ScholarGoogle Scholar
  27. K. Järvelin and J. Kekäläinen. Ir evaluation methods for retrieving highly relevant documents. In SIGIR, pages 41--48, 2000.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. G. Jeh and J. Widom. Scaling personalized web search. In WWW, pages 271--279, 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. J. Jung, N. Park, L. Sael, and U. Kang. Bepi: Fast and memory-efficient method for billion-scale random walk with restart. In SIGMOD, pages 789--804, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. J. Klicpera, A. Bojchevski, and S. Gnnemann. Predict then propagate: Graph neural networks meet personalized pagerank. In ICLR, 2019.Google ScholarGoogle Scholar
  31. W. Lin. Distributed algorithms for fully personalized pagerank on large graphs. In WWW, 2019.Google ScholarGoogle Scholar
  32. D. C. Liu, S. Rogers, R. Shiau, D. Kislyuk, K. C. Ma, Z. Zhong, J. Liu, and Y. Jing. Related pins at pinterest: The evolution of a real-world recommender system. In WWW Companion, pages 583--592, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. H. Liu and H. H. Huang. Enterprise: Breadth-first graph traversal on gpus. In SC, pages 1--12, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. H. Liu, H. H. Huang, and Y. Hu. ibfs: Concurrent breadth-first search on gpus. In SIGMOD, pages 403--416, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. P. Lofgren, S. Banerjee, and A. Goel. Bidirectional pagerank estimation: From average-case to worst-case. In WAW 2015, pages 164--176, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. P. Lofgren, S. Banerjee, and A. Goel. Personalized pagerank estimation and search: A bidirectional approach. In WSDM, pages 163--172, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. P. A. Lofgren, S. Banerjee, A. Goel, and C. Seshadhri. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In KDD, pages 1436--1445, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. L. Luo, M. Wong, and W.-m. Hwu. An effective gpu implementation of breadth-first search. In DAC, pages 52--55, 2010.Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. S. Luo, X. Xiao, W. Lin, and B. Kao. Efficient batch one-hop personalized pageranks. In ICDE, pages 245--256, 2019.Google ScholarGoogle ScholarCross RefCross Ref
  40. T. Maehara, T. Akiba, Y. Iwata, and K.-i. Kawarabayashi. Computing personalized pagerank quickly by exploiting graph structures. PVLDB, 7(12):1023--1034, 2014.Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. D. Merrill, M. Garland, and A. Grimshaw. Scalable gpu graph traversal. In PPOPP, pages 117--128, 2012.Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. D. Merrill, M. Garland, and A. S. Grimshaw. High-performance and scalable GPU graph traversal. TOPC, pages 14:1--14:30, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. D. Nguyen, A. Lenharth, and K. Pingali. A lightweight infrastructure for graph analytics. In SIGOPS, pages 456--471, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. J. Nielsen. Usability engineering. In The Computer Science and Engineering Handbook, pages 1440--1460. 1997.Google ScholarGoogle Scholar
  45. L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: bringing order to the web. 1999.Google ScholarGoogle Scholar
  46. A. D. Robison. Composable parallel patterns with intel cilk plus. CiSE, page 66, 2013.Google ScholarGoogle Scholar
  47. A. D. Sarma, A. R. Molla, G. Pandurangan, and E. Upfal. Fast distributed pagerank computation. In ICDCN, pages 11--26, 2013.Google ScholarGoogle Scholar
  48. K. Shin, J. Jung, L. Sael, and U. Kang. BEAR: block elimination approach for random walk with restart on large graphs. In SIGMOD, pages 1571--1585, 2015.Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. J. Shun and G. E. Blelloch. Ligra: a lightweight graph processing framework for shared memory. In PPoPP, pages 135--146, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. A. Walker. New fast method for generating discrete random numbers with arbitrary frequency distributions. Electronics Letters, pages 127 -- 128, 1974.Google ScholarGoogle ScholarCross RefCross Ref
  51. S. Wang, Y. Tang, X. Xiao, Y. Yang, and Z. Li. Hubppr: Effective indexing for approximate personalized pagerank. PVLDB, 10(3):205--216, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. S. Wang, R. Yang, X. Xiao, Z. Wei, and Y. Yang. Fora: Simple and effective approximate single-source personalized pagerank. In KDD, pages 505--514, 2017.Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. Y. Wang, A. A. Davidson, Y. Pan, Y. Wu, A. Riffel, and J. D. Owens. Gunrock: a high-performance graph processing library on the GPU. In PPoPP, pages 11:1--11:12, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Z. Wei, X. He, X. Xiao, S. Wang, S. Shang, and J.-R. Wen. Topppr: top-k personalized pagerank queries with precision guarantees on large graphs. In SIGMOD, pages 441--456, 2018.Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. N. Whitehead and A. Fit-florea. Precision & performance: Floating point and ieee 754 compliance for nvidia gpus, 2011.Google ScholarGoogle Scholar
  56. X. Yang, S. Parthasarathy, and P. Sadayappan. Fast sparse matrix-vector multiplication on gpus: implications for graph mining. PVLDB, 4(4):231--242, 2011.Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. W. Yu and X. Lin. IRWR: incremental random walk with restart. In SIGIR, pages 1017--1020, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. H. Zhang, P. Lofgren, and A. Goel. Approximate personalized pagerank on dynamic graphs. In KDD, pages 1315--1324, 2016.Google ScholarGoogle ScholarDigital LibraryDigital Library
  59. F. Zhu, Y. Fang, K. C. Chang, and J. Ying. Incremental and accuracy-aware personalized pagerank through scheduled approximation. PVLDB, 6(6):481--492, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in

Full Access

  • Published in

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 13, Issue 1
    September 2019
    85 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    • Published: 1 September 2019
    Published in pvldb Volume 13, Issue 1

    Qualifiers

    • research-article

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader