ABSTRACT
Link analysis is a key technology in contemporary web search engines. Most previous work on link analysis used information from only a single snapshot of the web graph. Because commercial search engines crawl the Web periodically, they naturally obtain time series of web graphs, and the historical information in such a series can be used to improve link analysis. In this paper, we argue that page importance should be a dynamic quantity, and we propose defining it as a function of both the PageRank of the current web graph and the accumulated historical page importance from previous web graphs. Specifically, we design a novel algorithm named TemporalRank to compute the proposed page importance. We use a kinetic model to interpret this page importance and show that it can be regarded as the solution to an ordinary differential equation. Experiments on link analysis using web graph data from five snapshots show that the proposed algorithm can outperform PageRank on many measures and can effectively filter out newly appeared link-spam websites.
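The abstract does not state the TemporalRank update rule, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes the simplest possible combination: a convex blend of the current snapshot's PageRank with the importance accumulated so far. The function names, the weight `alpha`, and the damping value are all hypothetical choices made for illustration.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]  # page -> pages it links to


def pagerank(graph: Graph, damping: float = 0.85, iters: int = 50) -> Dict[str, float]:
    """Plain power-iteration PageRank on a single snapshot."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        nxt = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u, targets in graph.items():
            for v in targets:
                nxt[v] += damping * rank[u] / len(targets)
        rank = nxt
    return rank


def blended_importance(snapshots: List[Graph], alpha: float = 0.5) -> Dict[str, float]:
    """Assumed update (not from the paper):
    importance_t = alpha * PageRank_t + (1 - alpha) * importance_{t-1}.
    """
    importance: Dict[str, float] = {}
    for graph in snapshots:
        current = pagerank(graph)
        if not importance:
            # No history yet: start from the first snapshot's PageRank.
            importance = current
            continue
        pages = set(importance) | set(current)
        # A page with no history contributes 0 for its past, so a page
        # that appears suddenly is discounted relative to its raw PageRank.
        importance = {
            p: alpha * current.get(p, 0.0) + (1.0 - alpha) * importance.get(p, 0.0)
            for p in pages
        }
    return importance
```

Under this assumed rule, a page that shows up for the first time in the latest snapshot keeps only a fraction `alpha` of its current PageRank, which is one way historical information could dampen newly appeared link spam. The same update can be read as a forward-Euler step of dI/dt = λ(PR(t) − I(t)) with step size chosen so that λΔt = `alpha`; whether the paper's kinetic model has this form is not stated in the abstract.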
Index Terms
- Link analysis using time series of web graphs