ABSTRACT
A parameterized family of non-linear, link-analytic ranking functions is proposed that includes PageRank as a special case and uses the convexity property of those functions to be more resistant to link spam attacks. One contribution of the paper is the construction of such a scheme with provable uniqueness and convergence guarantees. The paper also demonstrates that, even in an unlabelled scenario, this family can achieve spam resistance comparable to TrustRank [3], which uses spam or non-spam labels on a training set. When labels are available, the proposed method can use them to improve its performance and provide state-of-the-art link spam protection.
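For orientation, the sketch below shows only the standard linear PageRank iteration, the special case that the abstract says the proposed family contains. It does not implement the non-linear family itself, whose form is not given here. The function name `pagerank`, the `out_links` adjacency format, the damping value 0.85, and the toy graph are all illustrative assumptions.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration for standard (linear) PageRank.

    out_links maps each node to the list of nodes it links to.
    All names and defaults here are illustrative assumptions.
    """
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            for v in targets:
                # Each node splits its rank evenly among its out-links.
                new_rank[v] += damping * rank[u] / len(targets)
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy graph: "d" has no in-links, "c" is linked from every other node.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
```

In this linear scheme a node's score is a weighted sum of its in-neighbours' contributions, so adding more in-links can only raise it. That is the behaviour link farms exploit and that the abstract's non-linear family is meant to resist.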
- http://webspam.lip6.fr. 2007.
- Persi Diaconis and R. L. Graham. Spearman's footrule as a measure of disarray. Journal of the Royal Statistical Society. Series B (Methodological), 39(2):262--268, 1977.
- Zoltán Gyöngyi, Hector Garcia-Molina, and Jan Pedersen. Combating web spam with TrustRank. In VLDB, pages 576--587, 2004.
Index Terms
- A spam resistant family of concavo-convex ranks for link analysis