skip to main content
10.1145/2783258.2783297acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Efficient PageRank Tracking in Evolving Networks

Published:10 August 2015Publication History

ABSTRACT

Real-world networks, such as the World Wide Web and online social networks, are very large and are evolving rapidly. Thus tracking personalized PageRank in such evolving networks is an important challenge in network analysis and graph mining.

In this paper, we propose an efficient online algorithm for tracking personalized PageRank in an evolving network. The proposed algorithm tracks personalized PageRank accurately (i.e., within a given accuracy ε > 0). Moreover it can update the personalized PageRank scores in amortized O(1/ε) iterations for each graph modification. In addition, when m edges are randomly and sequentially inserted, the total number of iterations is expected to be O(log m/ε).

We evaluated our algorithm in real-world networks. In average case, for each edge insertion and deletion, our algorithm updated the personalized PageRank in 3us in a web graph with 105M vertices and 3.7B edges, and 20ms in a social network with 42M vertices and 1.5B edges. By comparing existing state-of-the-arts algorithms, our algorithm is 2--290 times faster with an equal accuracy.

Skip Supplemental Material Section

Supplemental Material

p875.mp4

mp4

162.2 MB

References

  1. R. Andersen, F. Chung, and K. Lang. Local graph partitioning using pagerank vectors. In FOCS, pages 475--486, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. B. Bahmani, A. Chowdhury, and A. Goel. Fast incremental and personalized pagerank. VLDB, 4(3):173--184, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. B. Bahmani, R. Kumar, M. Mahdian, and E. Upfal. Pagerank on an evolving graph. In KDD, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509--512, 1999.Google ScholarGoogle ScholarCross RefCross Ref
  5. A. A. Benczúr, K. Csalogány, T. Sarlós, and M. Uher. SpamRank--fully automatic link spam detection work in progress. In AIRWeb, 2005.Google ScholarGoogle Scholar
  6. P. Berkhin. Bookmark-coloring approach to personalized pagerank computing. Internet Math., 3(1):41--62, 2006.Google ScholarGoogle ScholarCross RefCross Ref
  7. S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Comput. Networks ISDN Syst., 30(1):107--117, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Cha, H. Haddadi, F. Benevenuto, and K. P. Gummadi. Measuring user influence in twitter: The million follower fallacy. In ICWSM, pages 10--17, 2010.Google ScholarGoogle Scholar
  9. S. Chien, C. Dwork, R. Kumar, D. R. Simon, and D. Sivakumar. Link evolution: Analysis and algorithms. Internet Math., 1(3):277--304, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  10. Y. Chung, M. Toyoda, and M. Kitsuregawa. Detecting link hijacking by web spammers. In PAKDD, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Y. Chung, M. Toyoda, and M. Kitsuregawa. A study of link farm distribution and evolution using a time series of web snapshots. In AIRWeb, pages 9--16, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Y. Chung, M. Toyoda, and M. Kitsuregawa. Identifying spam link generators for monitoring emerging web spam. In WICOW, pages 51--58, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. G. M. Del Corso, A. Gullí, and F. Romani. Fast pagerank computation via a sparse linear system. Internet Math., 2(3):251--273, 2005.Google ScholarGoogle ScholarCross RefCross Ref
  14. Y. Fujiwara, M. Nakatsuji, M. Onizuka, and M. Kitsuregawa. Fast and exact top-k search for random walk with restart. VLDB, 5(5):442--453, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. D. Gleich, L. Zhukov, and P. Berkhin. Fast parallel pagerank: A linear system approach. Yahoo! Research Technical Report YRL-2004-038, 13:22, 2004.Google ScholarGoogle Scholar
  16. Z. Gyöngyi and H. Garcia-Molina. Link spam alliances. In VLDB, pages 517--528, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Z. Gyöngyi, H. Garcia-Molina, and J. Pedersen. Combating web spam with trustrank. In VLDB, pages 576--587, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. W. Hoeffding. Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc., 58(301):13--30, 1963.Google ScholarGoogle ScholarCross RefCross Ref
  19. G. Jeh and J. Widom. Scaling personalized web search. In WWW, pages 271--279, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. S. D. Kamvar, T. H. Haveliwala, C. D. Manning, and G. H. Golub. Extrapolation methods for accelerating pagerank computations. In WWW, pages 261--270, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. N. Langville and C. D. Meyer. Deeper inside pagerank. Internet Math., 1(3):335--380, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  22. A. N. Langville and C. D. Meyer. Updating markov chains with an eye on google's pagerank. SIAM J. Matrix Anal. Appl., 27(4):968--987, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. P. Lofgren, S. Banerjee, A. Goel, and C. Seshadhri. FAST-PPR: Scaling personalized pagerank estimation for large graphs. In KDD, pages 1436--1445, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. T. Maehara, T. Akiba, Y. Iwata, and K. Kawarabayashi. Computing personalized pagerank quickly by exploiting graph structures. VLDB, 7(12), 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. F. McSherry. A uniform approach to accelerated pagerank computation. In WWW, pages 575--582, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, 1999.Google ScholarGoogle Scholar
  27. A. D. Sarma, S. Gollapudi, and R. Panigrahy. Estimating pagerank on graph streams. J. ACM, 58(3):13, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. G. Shen, B. Gao, T.-Y. Liu, G. Feng, S. Song, and H. Li. Detecting link spam using temporal information. In ICDM, pages 1049--1053, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. H. A. Simon and A. Ando. Aggregation of variables in dynamic systems. Econometrica, 29:114--138, 1961.Google ScholarGoogle ScholarCross RefCross Ref
  30. R. V. Southwell. Relaxation Methods in Engineering Science. Oxford University Press, 1940.Google ScholarGoogle Scholar
  31. R. V. Southwell. Relaxation Methods in Theoretical Physics. Oxford University Press, 1946.Google ScholarGoogle Scholar
  32. D. A. Spielman and S.-H. Teng. A local clustering algorithm for massive graphs and its application to nearly linear time graph partitioning. SIAM J. Comput., 42(1):1--26, 2013.Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. J. Weng, E. P. Lim, J. Jiang, and Q. He. TwitterRank: finding topic-sensitive influential twitterers. In WSDM, pages 261--270, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Efficient PageRank Tracking in Evolving Networks

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      KDD '15: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
      August 2015
      2378 pages
      ISBN:9781450336642
      DOI:10.1145/2783258

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 10 August 2015

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      KDD '15 Paper Acceptance Rate160of819submissions,20%Overall Acceptance Rate1,133of8,635submissions,13%

      Upcoming Conference

      KDD '24

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader