skip to main content
10.1145/3183713.3196920acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs

Authors Info & Claims
Published:27 May 2018Publication History

ABSTRACT

Personalized PageRank (PPR) is a classic metric that measures the relevance of graph nodes with respect to a source node. Given a graph G, a source node s, and a parameter k, a top-k PPR query returns a set of k nodes with the highest PPR values with respect to s. This type of queries serves as an important building block for numerous applications in web search and social networks, such as Twitter's Who-To-Follow recommendation service. Existing techniques for top-k PPR, however, suffer from two major deficiencies. First, they either incur prohibitive space and time overheads on large graphs, or fail to provide any guarantee on the precision of top-k results (i.e., the results returned might miss a number of actual top-k answers). Second, most of them require significant pre-computation on the input graph G, which renders them unsuitable for graphs with frequent updates (e.g., Twitter's social graph).

To address the deficiencies of existing solutions, we propose PPR, an algorithm for top-k PPR queries that ensure at least ρ precision (i.e., at least ρ fraction of the actual top-k results are returned) with at least 1 - 1/n probability, where ρ ∈;n (0, 1] is a user-specified parameter and n is the number of nodes in G. In addition, PPR offers non-trivial guarantees on query time in terms of ρ, and it can easily handle dynamic graphs as it does not require any preprocessing. We experimentally evaluate PPR using a variety of benchmark datasets, and demonstrate that PPR outperforms the state-of-the-art solutions in terms of both efficiency and precision, even when we set ρ = 1 (i.e., when PPR returns the exact top-k results). Notably, on a billion-edge Twitter graph, PPR only requires 15 seconds to answer a top-500 PPR query with ρ = 1.

References

  1. https://spark-summit.org/2017/events/random-walks-on-large-scale-graphs-with-apache-spark/.Google ScholarGoogle Scholar
  2. http://compbio.case.edu/omics/software/chopper/index.html.Google ScholarGoogle Scholar
  3. https://github.com/YubaoWu/FLoS.Google ScholarGoogle Scholar
  4. https://datalab.snu.ac.kr/bepi/.Google ScholarGoogle Scholar
  5. http://snap.stanford.edu/data/index.html.Google ScholarGoogle Scholar
  6. http://law.di.unimi.it/datasets.php.Google ScholarGoogle Scholar
  7. Reid Andersen, Christian Borgs, Jennifer T. Chayes, John E. Hopcroft, Vahab S. Mirrokni, and Shang-Hua Teng. Local computation of pagerank contributions. In WAW, pages 150--165, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Reid Andersen, Fan R. K. Chung, and Kevin J. Lang. Local graph partitioning using pagerank vectors. In FOCS, pages 475--486, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Jean-Yves Audibert, Rémi Munos, and Csaba Szepesvári. Tuning bandit algorithms in stochastic environments. In ALT, volume 4754, pages 150--165. Springer, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Lars Backstrom and Jure Leskovec. Supervised random walks: predicting and recommending links in social networks. In WSDM, pages 635--644, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Bahman Bahmani, Kaushik Chakrabarti, and Dong Xin. Fast personalized pagerank on mapreduce. In SIGMOD, pages 973--984, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Bahman Bahmani, Abdur Chowdhury, and Ashish Goel. Fast incremental and personalized pagerank. VLDB, 4(3):173--184, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Soumen Chakrabarti. Dynamic personalized pagerank in entity-relation graphs. In WWW, pages 571--580, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Fan R. K. Chung and Lincoln Lu. Survey: Concentration inequalities and martingale inequalities: A survey. Internet Mathematics, 3(1):79--127, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  15. Mustafa Coskun, Ananth Grama, and Mehmet Koyuturk. Efficient processing of network proximity queries via chebyshev acceleration. In KDD, pages 1515--1524, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Dániel Fogaras, Balázs Rácz, Károly Csalogány, and Tamás Sarlós. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, 2(3):333--358, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  17. Yasuhiro Fujiwara, Makoto Nakatsuji, Makoto Onizuka, and Masaru Kitsuregawa. Fast and exact top-k search for random walk with restart. PVLDB, 5(5):442--453, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Takeshi Mishima, and Makoto Onizuka. Efficient ad-hoc search for personalized pagerank. In SIGMOD, pages 445--456, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Takeshi Mishima, and Makoto Onizuka. Fast and exact top-k algorithm for pagerank. In AAAI, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Yasuhiro Fujiwara, Makoto Nakatsuji, Takeshi Yamamuro, Hiroaki Shiokawa, and Makoto Onizuka. Efficient personalized pagerank with accuracy assurance. In KDD, pages 15--23, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Tao Guo, Xin Cao, Gao Cong, Jiaheng Lu, and Xuemin Lin. Distributed algorithms on exact personalized pagerank. In SIGMOD, pages 479--494, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Manish S. Gupta, Amit Pathak, and Soumen Chakrabarti. Fast algorithms for top-k personalized pagerank queries. In WWW, pages 1225--1226, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Zadeh. Wtf: The who to follow service at twitter. In WWW, pages 505--514, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Ali Hadian, Sadegh Nobari, Behrooz Minaei-Bidgoli, and Qiang Qu. Roll: Fast in-memory generation of gigantic scale-free networks. In SIGMOD, pages 1829--1842. ACM, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Kalervo J"arvelin and Jaana Kekäläinen. IR evaluation methods for retrieving highly relevant documents. In SIGIR, pages 41--48, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Glen Jeh and Jennifer Widom. Scaling personalized web search. In WWW, pages 271--279, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Jinhong Jung, Namyong Park, Sael Lee, and U Kang. Bepi: Fast and memory-efficient method for billion-scale random walk with restart. In SIGMOD, pages 789--804, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. David C. Liu, Stephanie Rogers, Raymond Shiau, Dmitry Kislyuk, Kevin C. Ma, Zhigang Zhong, Jenny Liu, and Yushi Jing. Related pins at pinterest: The evolution of a real-world recommender system. In WWW (Companion), pages 583--592, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. Bidirectional pagerank estimation: From average-case to worst-case. In WAW, pages 164--176, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Peter Lofgren, Siddhartha Banerjee, and Ashish Goel. Personalized pagerank estimation and search: A bidirectional approach. In WSDM, pages 163--172, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Peter A Lofgren, Siddhartha Banerjee, Ashish Goel, and C Seshadhri. Fast-ppr: Scaling personalized pagerank estimation for large graphs. In KDD, pages 1436--1445, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Takanori Maehara, Takuya Akiba, Yoichi Iwata, and Ken-ichi Kawarabayashi. Computing personalized pagerank quickly by exploiting graph structures. PVLDB, 7(12):1023--1034, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Naoto Ohsaka, Takanori Maehara, and Ken-ichi Kawarabayashi. Efficient pagerank tracking in evolving networks. In KDD, pages 875--884, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: bringing order to the web. 1999.Google ScholarGoogle Scholar
  35. CH Ren, Luyi Mo, CM Kao, CK Cheng, and DWL Cheung. Clude: An efficient algorithm for lu decomposition over a sequence of evolving graphs. In EDBT, 2014.Google ScholarGoogle Scholar
  36. Atish Das Sarma, Anisur Rahaman Molla, Gopal Pandurangan, and Eli Upfal. Fast distributed pagerank computation. Theoretical Computer Science, 561:113--121, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Kijung Shin, Jinhong Jung, Lee Sael, and U. Kang. BEAR: block elimination approach for random walk with restart on large graphs. In SIGMOD, pages 1571--1585, 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Alastair J. Walker. New fast method for generating discrete random numbers with arbitrary frequency distributions. Electronics Letters, 10(8):127--128, 1974.Google ScholarGoogle ScholarCross RefCross Ref
  39. Sibo Wang, Youze Tang, Xiaokui Xiao, Yin Yang, and Zengxiang Li. Hubppr: Effective indexing for approximate personalized pagerank. PVLDB, 10(3):205--216, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Sibo Wang, Renchi Yang, Xiaokui Xiao, Zhewei Wei, and Yin Yang. FORA: simple and effective approximate single-source personalized pagerank. In KDD, pages 505--514, 2017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Yubao Wu, Ruoming Jin, and Xiang Zhang. Fast and unified local search for random walk based k-nearest-neighbor query in large graphs. In SIGMOD 2014, pages 1139--1150, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. Weiren Yu and Xuemin Lin. IRWR: incremental random walk with restart. In SIGIR, pages 1017--1020, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Weiren Yu and Julie A. McCann. Random walk with restart over dynamic graphs. In ICDM, pages 589--598, 2016.Google ScholarGoogle ScholarCross RefCross Ref
  44. Hongyang Zhang, Peter Lofgren, and Ashish Goel. Approximate personalized pagerank on dynamic graphs. In KDD, pages 1315--1324, 2016. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Fanwei Zhu, Yuan Fang, Kevin Chen-Chuan Chang, and Jing Ying. Incremental and accuracy-aware personalized pagerank through scheduled approximation. PVLDB, 6(6):481--492, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. TopPPR: Top-k Personalized PageRank Queries with Precision Guarantees on Large Graphs

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data
        May 2018
        1874 pages
        ISBN:9781450347037
        DOI:10.1145/3183713

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 27 May 2018

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        SIGMOD '18 Paper Acceptance Rate90of461submissions,20%Overall Acceptance Rate785of4,003submissions,20%

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader