2010 | OriginalPaper | Chapter
Combating Link Spam by Noisy Link Analysis
Authors: Yitong Wang, Xiaofei Chen, Xiaojun Feng
Published in: Advanced Data Mining and Applications
Publisher: Springer Berlin Heidelberg
Link spam has been identified as one of the major obstacles for the link-based ranking algorithms of modern search engines, since it deliberately constructs hyperlink structures that help poor-content pages obtain undeservedly high ranks. The problem has grown worse with the advent of wikis, blogs, and forums, which are rich in links. Existing work on link spam focuses mainly on detection by extracting special link structures (e.g., cliques and tight bipartite structures). However, link spam structures can have many variations, which easily render existing detection methods ineffective. In this paper, we tackle the problem of link spam from a more fundamental viewpoint: “noisy link” analysis. We first investigate how “non-voting” hyperlinks affect the quality of ranking and then, based on this investigation, propose an approach that detects and processes “noisy links” both effectively and automatically. We also compare our work with two related approaches (TrustRank and site-level noise removal) on two real web datasets. The experimental results demonstrate that the proposed “noisy link” analysis is very effective for both spam page filtering and final ranking improvement.