Abstract
Given a graph G, a source node s ∈ G and a positive integer k, a top-k Personalized PageRank (PPR) query returns the k nodes with the highest PPR values with respect to s, where the PPR of a node v measures its relevance from the perspective of source s. Top-k PPR processing is a fundamental task in many important applications such as web search, social networks, and graph analytics. This paper aims to answer such a query in realtime, i.e., within less than 100ms, on an Internet-scale graph with billions of edges. This is far beyond the current state of the art, due to the immense computational cost of processing a PPR query. We achieve this goal with a novel algorithm kPAR, which utilizes the massive parallel processing power of GPUs.
The main challenge in designing a GPU-based PPR algorithm lies in that a GPU is mainly a parallel computation device, whereas PPR processing involves graph traversals and value propagation operations, which are inherently sequential and memory-bound. Existing scalable PPR algorithms are mostly described as single-thread CPU solutions that are resistant to parallelization. Further, they usually involve complex data structures which do not have efficient adaptations on GPUs. kPAR overcomes these problems via both novel algorithmic designs (namely, adaptive forward push and inverted random walks) and system engineering (e.g., load balancing) to realize the potential of GPUs. Meanwhile, kPAR provides rigorous guarantees on both result quality and worst-case efficiency. Extensive experiments show that kPAR is usually 10x faster than parallel adaptations of existing methods. Notably, on a billion-edge Twitter graph, kPAR answers a top-1000 PPR query in 42.4 milliseconds.
- https://www.nngroup.com/articles/response-times-3-important-limits/.Google Scholar
- https://github.com/wangsibovictor/fora.Google Scholar
- https://github.com/wzskytop/TopPPR.Google Scholar
- http://snap.stanford.edu/data.Google Scholar
- http://law.di.unimi.it/datasets.php.Google Scholar
- https://sites.google.com/view/kpar-tr.Google Scholar
- https://developer.nvidia.com/nvgraph.Google Scholar
- R. Andersen, C. Borgs, J. T. Chayes, J. E. Hopcroft, V. S. Mirrokni, and S. Teng. Local computation of pagerank contributions. In WAW, pages 150--165, 2007.Google ScholarDigital Library
- R. Andersen, F. R. K. Chung, and K. J. Lang. Local graph partitioning using pagerank vectors. In FOCS, pages 475--486, 2006.Google ScholarDigital Library
- A. Ashari, N. Sedaghati, J. Eisenlohr, S. Parthasarathy, and P. Sadayappan. Fast sparse matrix-vector multiplication on gpus for graph applications. In SC, pages 781--792, 2014.Google ScholarDigital Library
- L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. In WSDM, pages 635--644, 2011.Google ScholarDigital Library
- B. Bahmani, K. Chakrabarti, and D. Xin. Fast personalized pagerank on mapreduce. In SIGMOD, pages 973--984, 2011.Google ScholarDigital Library
- B. Bahmani, A. Chowdhury, and A. Goel. Fast incremental and personalized pagerank. PVLDB, 4(3):173--184, 2010.Google ScholarDigital Library
- S. Chakrabarti. Dynamic personalized pagerank in entity-relation graphs. In WWW, pages 571--580, 2007.Google ScholarDigital Library
- F. R. K. Chung and L. Lu. Survey: Concentration inequalities and martingale inequalities: A survey. Internet Mathematics, 3(1):79--127, 2006.Google ScholarCross Ref
- L. Dagum and R. Menon. Openmp: An industry-standard api for shared-memory programming. CiSE, pages 46--55, 1998.Google ScholarDigital Library
- D. Fogaras, B. Rácz, K. Csalogány, and T. Sarlos. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, 2(3):333--358, 2005.Google ScholarCross Ref
- Y. Fujiwara, M. Nakatsuji, H. Shiokawa, T. Mishima, and M. Onizuka. Efficient ad-hoc search for personalized pagerank. In SIGMOD, pages 445--456, 2013.Google ScholarDigital Library
- Y. Fujiwara, M. Nakatsuji, H. Shiokawa, T. Mishima, and M. Onizuka. Fast and exact top-k algorithm for pagerank. In AAAI, 2013.Google ScholarDigital Library
- Y. Fujiwara, M. Nakatsuji, T. Yamamuro, H. Shiokawa, and M. Onizuka. Efficient personalized pagerank with accuracy assurance. In KDD, pages 15--23, 2012.Google ScholarDigital Library
- T. Guo, X. Cao, G. Cong, J. Lu, and X. Lin. Distributed algorithms on exact personalized pagerank. In SIGMOD, pages 479--494, 2017.Google ScholarDigital Library
- W. Guo, Y. Li, M. Sha, and K.-L. Tan. Parallel personalized pagerank on dynamic graphs. PVLDB, 11(1):93--106, 2017.Google ScholarDigital Library
- M. S. Gupta, A. Pathak, and S. Chakrabarti. Fast algorithms for topk personalized pagerank queries. In WWW, pages 1225--1226, 2008.Google ScholarDigital Library
- P. Gupta, A. Goel, J. Lin, A. Sharma, D. Wang, and R. Zadeh. Wtf: The who to follow service at twitter. In WWW, pages 505--514, 2013.Google ScholarDigital Library
- P. Harish and P. Narayanan. Accelerating large graph algorithms on the gpu using cuda. In HiPC, pages 197--208, 2007.Google ScholarDigital Library
- G. Iván and V. Grolmusz. When the web meets the cell: using personalized pagerank for analyzing protein interaction networks. Bioinformatics, pages 405--407, 2011.Google Scholar
- K. Järvelin and J. Kekäläinen. Ir evaluation methods for retrieving highly relevant documents. In SIGIR, pages 41--48, 2000.Google ScholarDigital Library
- G. Jeh and J. Widom. Scaling personalized web search. In WWW, pages 271--279, 2003.Google ScholarDigital Library
- J. Jung, N. Park, L. Sael, and U. Kang. Bepi: Fast and memory-efficient method for billion-scale random walk with restart. In SIGMOD, pages 789--804, 2017.Google ScholarDigital Library
- J. Klicpera, A. Bojchevski, and S. Gnnemann. Predict then propagate: Graph neural networks meet personalized pagerank. In ICLR, 2019.Google Scholar
- W. Lin. Distributed algorithms for fully personalized pagerank on large graphs. In WWW, 2019.Google Scholar
- D. C. Liu, S. Rogers, R. Shiau, D. Kislyuk, K. C. Ma, Z. Zhong, J. Liu, and Y. Jing. Related pins at pinterest: The evolution of a real-world recommender system. In WWW Companion, pages 583--592, 2017.Google ScholarDigital Library
- H. Liu and H. H. Huang. Enterprise: Breadth-first graph traversal on gpus. In SC, pages 1--12, 2015.Google ScholarDigital Library
- H. Liu, H. H. Huang, and Y. Hu. ibfs: Concurrent breadth-first search on gpus. In SIGMOD, pages 403--416, 2016.Google ScholarDigital Library
- P. Lofgren, S. Banerjee, and A. Goel. Bidirectional pagerank estimation: From average-case to worst-case. In WAW 2015, pages 164--176, 2015.Google ScholarDigital Library
- P. Lofgren, S. Banerjee, and A. Goel. Personalized pagerank estimation and search: A bidirectional approach. In WSDM, pages 163--172, 2016.Google ScholarDigital Library
- P. A. Lofgren, S. Banerjee, A. Goel, and C. Seshadhri. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In KDD, pages 1436--1445, 2014.Google ScholarDigital Library
- L. Luo, M. Wong, and W.-m. Hwu. An effective gpu implementation of breadth-first search. In DAC, pages 52--55, 2010.Google ScholarDigital Library
- S. Luo, X. Xiao, W. Lin, and B. Kao. Efficient batch one-hop personalized pageranks. In ICDE, pages 245--256, 2019.Google ScholarCross Ref
- T. Maehara, T. Akiba, Y. Iwata, and K.-i. Kawarabayashi. Computing personalized pagerank quickly by exploiting graph structures. PVLDB, 7(12):1023--1034, 2014.Google ScholarDigital Library
- D. Merrill, M. Garland, and A. Grimshaw. Scalable gpu graph traversal. In PPOPP, pages 117--128, 2012.Google ScholarDigital Library
- D. Merrill, M. Garland, and A. S. Grimshaw. High-performance and scalable GPU graph traversal. TOPC, pages 14:1--14:30, 2015.Google ScholarDigital Library
- D. Nguyen, A. Lenharth, and K. Pingali. A lightweight infrastructure for graph analytics. In SIGOPS, pages 456--471, 2013.Google ScholarDigital Library
- J. Nielsen. Usability engineering. In The Computer Science and Engineering Handbook, pages 1440--1460. 1997.Google Scholar
- L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: bringing order to the web. 1999.Google Scholar
- A. D. Robison. Composable parallel patterns with intel cilk plus. CiSE, page 66, 2013.Google Scholar
- A. D. Sarma, A. R. Molla, G. Pandurangan, and E. Upfal. Fast distributed pagerank computation. In ICDCN, pages 11--26, 2013.Google Scholar
- K. Shin, J. Jung, L. Sael, and U. Kang. BEAR: block elimination approach for random walk with restart on large graphs. In SIGMOD, pages 1571--1585, 2015.Google ScholarDigital Library
- J. Shun and G. E. Blelloch. Ligra: a lightweight graph processing framework for shared memory. In PPoPP, pages 135--146, 2013.Google ScholarDigital Library
- A. Walker. New fast method for generating discrete random numbers with arbitrary frequency distributions. Electronics Letters, pages 127 -- 128, 1974.Google ScholarCross Ref
- S. Wang, Y. Tang, X. Xiao, Y. Yang, and Z. Li. Hubppr: Effective indexing for approximate personalized pagerank. PVLDB, 10(3):205--216, 2016.Google ScholarDigital Library
- S. Wang, R. Yang, X. Xiao, Z. Wei, and Y. Yang. Fora: Simple and effective approximate single-source personalized pagerank. In KDD, pages 505--514, 2017.Google ScholarDigital Library
- Y. Wang, A. A. Davidson, Y. Pan, Y. Wu, A. Riffel, and J. D. Owens. Gunrock: a high-performance graph processing library on the GPU. In PPoPP, pages 11:1--11:12, 2016.Google ScholarDigital Library
- Z. Wei, X. He, X. Xiao, S. Wang, S. Shang, and J.-R. Wen. Topppr: top-k personalized pagerank queries with precision guarantees on large graphs. In SIGMOD, pages 441--456, 2018.Google ScholarDigital Library
- N. Whitehead and A. Fit-florea. Precision & performance: Floating point and ieee 754 compliance for nvidia gpus, 2011.Google Scholar
- X. Yang, S. Parthasarathy, and P. Sadayappan. Fast sparse matrix-vector multiplication on gpus: implications for graph mining. PVLDB, 4(4):231--242, 2011.Google ScholarDigital Library
- W. Yu and X. Lin. IRWR: incremental random walk with restart. In SIGIR, pages 1017--1020, 2013.Google ScholarDigital Library
- H. Zhang, P. Lofgren, and A. Goel. Approximate personalized pagerank on dynamic graphs. In KDD, pages 1315--1324, 2016.Google ScholarDigital Library
- F. Zhu, Y. Fang, K. C. Chang, and J. Ying. Incremental and accuracy-aware personalized pagerank through scheduled approximation. PVLDB, 6(6):481--492, 2013.Google ScholarDigital Library
Recommendations
Efficient PageRank and SpMV Computation on AMD GPUs
ICPP '10: Proceedings of the 2010 39th International Conference on Parallel ProcessingGoogle's famous PageRank algorithm is widely used to determine the importance of web pages in search engines. Given the large number of web pages on the World Wide Web, efficient computation of PageRank becomes a challenging problem. We accelerated the ...
On Analyzing Large Graphs Using GPUs
IPDPSW '13: Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing Workshops and PhD ForumStudying properties of graphs is essential to various applications, and recent growth of online social networks has spurred interests in analyzing their structures using Graphical Processing Units (GPUs). Utilizing the faster available shared memory on ...
Using GPUs to compute large out-of-card FFTs
ICS '11: Proceedings of the international conference on SupercomputingThe optimization of Fast Fourier Transfer (FFT) problems that can fit into GPU memory has been studied extensively. Such on-card FFT libraries like CUFFT can generally achieve much better performance than their counterparts on a CPU, as the data ...
Comments