ABSTRACT
One of the most important features of the Web graph and social networks is that they are constantly evolving. The classical computational paradigm, which assumes a fixed data set as an input to an algorithm that terminates, is inadequate for such settings. In this paper we study the problem of computing PageRank on an evolving graph. We propose an algorithm that, at any moment in the time and by crawling a small portion of the graph, provides an estimate of the PageRank that is close to the true PageRank of the graph at that moment. We will also evaluate our algorithm experimentally on real data sets and on randomly generated inputs. Under a stylized model of graph evolution, we show that our algorithm achieves a provable performance guarantee that is significantly better than the naive algorithm that crawls the nodes in a round-robin fashion.
Supplemental Material
- A. Anagnostopoulos, R. Kumar, M. Mahdian, and E. Upfal. Sort me if you can: How to sort dynamic data. In ICALP, pages 339--350, 2009. Google ScholarDigital Library
- A. Anagnostopoulos, R. Kumar, M. Mahdian, and E. Upfal. Algorithms on evolving graphs. In ITCS, pages 149--160, 2012. Google ScholarDigital Library
- K. Avrachenkov, N. Litvak, D. Nemirovsky, and N. Osipova. Monte Carlo methods in Pagerank computation: When one iteration is sufficient. SIAM J. Numer. Anal., 45(2):890--904, 2007. Google ScholarDigital Library
- B. Bahmani, K. Chakrabarti, and D. Xin. Fast personalized PageRank on MapReduce. In SIGMOD, pages 973--984, 2011. Google ScholarDigital Library
- B. Bahmani, A. Chowdhury, and A. Goel. Fast incremental and personalized PageRank. PVLDB, 4:173--184, 2010. Google ScholarDigital Library
- P. Berkhin. Survey: A survey on PageRank computing. Internet Mathematics, 2(1):73--120, 2005.Google ScholarCross Ref
- Y.-Y. Chen, Q. Gan, and T. Suel. Local methods for estimating PageRank values. In CIKM, pages 381--389, 2004. Google ScholarDigital Library
- S. Chien, C. Dwork, S. Kumar, D. Simon, and D. Sivakumar. Link evolution: Analysis and algorithms. Internet Mathematics, 1(3):277--304, 2003.Google ScholarCross Ref
- A. Das Sarma, S. Gollapudi, and R. Panigrahy. Estimating PageRank on graph streams. In PODS, pages 69--78, 2008. Google ScholarDigital Library
- A. Dasgupta, A. Ghosh, R. Kumar, C. Olston, S. Pandey, and A. Tomkins. The discoverability of the web. In WWW, pages 421--430, 2007. Google ScholarDigital Library
- J. V. Davis and I. S. Dhillon. Estimating the global PageRank of web communities. In KDD, pages 116--125, 2006. Google ScholarDigital Library
- D. Eppstein, Z. Galil, and G. F. Italiano. Dynamic graph algorithms. In M. J. Atallah, editor, Algorithms and Theoretical Computing Handbook. CRC Press, 1999.Google Scholar
- S. Kamvar, T. Haveliwala, and G. Golub. Adaptive methods for the computation of PageRank. Technical report, Stanford University, 2003.Google Scholar
- S. Kamvar, T. Haveliwala, C. Manning, and G. Golub. Exploiting the block structure of the web for computing PageRank. Technical report, Stanford University, 2003.Google Scholar
- S. D. Kamvar, T. H. Haveliwala, C. D. Manning, and G. H. Golub. Extrapolation methods for accelerating PageRank computations. In WWW, pages 261--270, 2003. Google ScholarDigital Library
- R. Kleinberg. Online Decision Problems with Large Strategy Sets. PhD thesis, MIT, 2005. Google ScholarDigital Library
- A. N. Langville and C. D. Meyer. Updating PageRank using the group inverse and stochastic complementation. Technical report, NCSU, Mathematics Department, 2002.Google Scholar
- A. N. Langville and C. D. Meyer. Updating PageRank with iterative aggregation. In WWW (posters), pages 392--393, 2004. Google ScholarDigital Library
- A. N. Langville and C. D. Meyer. Updating Markov Chains with an eye on Google's PageRank. SIAM J. Matrix Anal. Appl., 27(4):968--987, 2006. Google ScholarDigital Library
- F. McSherry. A uniform approach to accelerated PageRank computation. In WWW, pages 575--582, 2005. Google ScholarDigital Library
- S. Muthukrishnan. Data Streams: Algorithms and Applications. Now Publishers Inc, 2005.Google ScholarDigital Library
- C. Olston and S. Pandey. Recrawl scheduling based on information longevity. In WWW, pages 437--446, 2008. Google ScholarDigital Library
- L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, 1999.Google Scholar
- S. Pandey and C. Olston. User-centric web crawling. In WWW, pages 401--411, 2005. Google ScholarDigital Library
- S. Pandey and C. Olston. Crawl ordering by search impact. In WSDM, pages 3--14, 2008. Google ScholarDigital Library
- D. Ron. Property testing: A learning theory perspective. Found. Trends Mach. Learn., 1:307--402, 2008. Google ScholarDigital Library
- D. Ron. Algorithmic and analysis techniques in property testing. Found. Trends Theor. Comput. Sci., 5:73--205, 2010. Google ScholarDigital Library
- H. H. Song, T. W. Cho, V. Dave, Y. Zhang, and L. Qiu. Scalable proximity estimation and link prediction in online social networks. In IMC, pages 322--335, 2009. Google ScholarDigital Library
- H. Tong, S. Papadimitriou, P. S. Yu, and C. Faloutsos. Proximity tracking on time-evolving bipartite graphs. In SDM, pages 704--715, 2008.Google ScholarCross Ref
- J. L. Wolf, M. S. Squillante, P. S. Yu, J. Sethuraman, and L. Ozsen. Optimal crawling strategies for web search engines. In WWW, pages 136--147, 2002. Google ScholarDigital Library
Index Terms
- PageRank on an evolving graph
Recommendations
PageRank in Undirected Random Graphs
WAW 2015: Proceedings of the 12th International Workshop on Algorithms and Models for the Web Graph - Volume 9479PageRank has numerous applications in information retrieval, reputation systems, machine learning, and graph partitioning. In this paper, we study PageRank in undirected random graphs with expansion property. The Chung-Lu random graph represents an ...
A note on the PageRank of undirected graphs
The PageRank is a widely used scoring function of networks in general and of the World Wide Web graph in particular. The PageRank is defined for directed graphs, but in some special cases applications for undirected graphs occur. In the literature it is ...
Supervised Nested PageRank
CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge ManagementGraph-based ranking plays a key role in many applications, such as web search and social computing. Pioneering methods of ranking on graphs (e.g., PageRank and HITS) computed ranking scores relying only on the graph structure. Recently proposed methods, ...
Comments