skip to main content
research-article

HubPPR: effective indexing for approximate personalized pagerank

Authors Info & Claims
Published:01 November 2016Publication History
Skip Abstract Section

Abstract

Personalized PageRank (PPR) computation is a fundamental operation in web search, social networks, and graph analysis. Given a graph G, a source s, and a target t, the PPR query Π(s, t) returns the probability that a random walk on G starting from s terminates at t. Unlike global PageRank which can be effectively pre-computed and materialized, the PPR result depends on both the source and the target, rendering results materialization infeasible for large graphs. Existing indexing techniques have rather limited effectiveness; in fact, the current state-of-the-art solution, BiPPR, answers individual PPR queries without pre-computation or indexing, and yet it outperforms all previous index-based solutions.

Motivated by this, we propose HubPPR, an effective indexing scheme for PPR computation with controllable tradeoffs for accuracy, query time, and memory consumption. The main idea is to pre-compute and index auxiliary information for selected hub nodes that are often involved in PPR processing. Going one step further, we extend HubPPR to answer top-k PPR queries, which returns the k nodes with the highest PPR values with respect to a source s, among a given set T of target nodes. Extensive experiments demonstrate that compared to the current best solution BiPPR, HubPPR achieves up to 10x and 220x speedup for PPR and top-k PPR processing, respectively, with moderate memory consumption. Notably, with a single commodity server, HubPPR answers a top-k PPR query in seconds on graphs with billions of edges, with high accuracy and strong result quality guarantees.

References

  1. https://sites.google.com/site/hubpprtr2016/.Google ScholarGoogle Scholar
  2. http://snap.stanford.edu/data.Google ScholarGoogle Scholar
  3. R. Andersen, C. Borgs, J. T. Chayes, J. E. Hopcroft, V. S. Mirrokni, and S. Teng. Local computation of pagerank contributions. In WAW, pages 150--165, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. R. Andersen, F. R. K. Chung, and K. J. Lang. Local graph partitioning using pagerank vectors. In FOCS, pages 475--486, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. K. Avrachenkov, N. Litvak, D. Nemirovsky, and N. Osipova. Monte carlo methods in pagerank computation: When one iteration is sufficient. SIAM J. Numerical Analysis, 45(2):890--904, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. K. Avrachenkov, N. Litvak, D. Nemirovsky, E. Smirnova, and M. Sokol. Quick detection of top-k personalized pagerank lists. In WAW, pages 50--61, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. In WSDM, pages 635--644, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. B. Bahmani, K. Chakrabarti, and D. Xin. Fast personalized pagerank on mapreduce. In SIGMOD, pages 973--984, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. P. Berkhin. Survey: A survey on pagerank computing. Internet Mathematics, 2(1):73--120, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  10. P. Berkhin. Bookmark-coloring algorithm for personalized pagerank computing. Internet Mathematics, 3(1):41--62, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  11. S. Chakrabarti. Dynamic personalized pagerank in entity-relation graphs. In WWW, pages 571--580, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. F. R. K. Chung and L. Lu. Survey: Concentration inequalities and martingale inequalities: A survey. Internet Mathematics, 3(1):79--127, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  13. D. Fogaras, B. Rácz, K. Csalogány, and T. Sarlós. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, 2(3):333--358, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  14. Y. Fujiwara, M. Nakatsuji, H. Shiokawa, T. Mishima, and M. Onizuka. Efficient ad-hoc search for personalized pagerank. In SIGMOD, pages 445--456, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Y. Fujiwara, M. Nakatsuji, T. Yamamuro, H. Shiokawa, and M. Onizuka. Efficient personalized pagerank with accuracy assurance. In KDD, pages 15--23, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. F. L. Gall. Powers of tensors and fast matrix multiplication. In ISSAC, pages 296--303, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. M. S. Gupta, A. Pathak, and S. Chakrabarti. Fast algorithms for topk personalized pagerank queries. In WWW, pages 1225--1226, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. Gupta, A. Goel, J. Lin, A. Sharma, D. Wang, and R. Zadeh. Wtf: The who to follow service at twitter. In WWW, pages 505--514, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. G. Iván and V. Grolmusz. When the web meets the cell: using personalized pagerank for analyzing protein interaction networks. Bioinformatics, 27(3):405--407, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. G. Jeh and J. Widom. Scaling personalized web search. In WWW, pages 271--279, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. P. Lofgren, S. Banerjee, and A. Goel. Personalized pagerank estimation and search: A bidirectional approach. In WSDM, pages 163--172, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. P. A. Lofgren, S. Banerjee, A. Goel, and C. Seshadhri. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In KDD, pages 1436--1445, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. T. Maehara, T. Akiba, Y. Iwata, and K.-i. Kawarabayashi. Computing personalized pagerank quickly by exploiting graph structures. PVLDB, 7(12):1023--1034, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. R. Motwani and P. Raghavan. Randomized algorithms. Chapman & Hall/CRC, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: bringing order to the web. 1999.Google ScholarGoogle Scholar
  26. A. D. Sarma, A. R. Molla, G. Pandurangan, and E. Upfal. Fast distributed pagerank computation. In ICDCN, pages 11--26, 2013.Google ScholarGoogle Scholar
  27. K. Shin, J. Jung, L. Sael, and U. Kang. BEAR: block elimination approach for random walk with restart on large graphs. In SIGMOD, pages 1571--1585, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. D. Williams. Probability with martingales. Cambridge university press, 1991.Google ScholarGoogle Scholar
  29. F. Zhu, Y. Fang, K. C. Chang, and J. Ying. Incremental and accuracy-aware personalized pagerank through scheduled approximation. PVLDB, 6(6):481--492, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. HubPPR: effective indexing for approximate personalized pagerank
    Index terms have been assigned to the content through auto-classification.

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image Proceedings of the VLDB Endowment
      Proceedings of the VLDB Endowment  Volume 10, Issue 3
      November 2016
      216 pages
      ISSN:2150-8097
      Issue’s Table of Contents

      Publisher

      VLDB Endowment

      Publication History

      • Published: 1 November 2016
      Published in pvldb Volume 10, Issue 3

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader