Relevance is more significant than correlation: Information filtering on sparse data


Published 7 January 2010 Europhysics Letters Association
Citation: Ming-Sheng Shang et al 2009 EPL 88 68008. DOI: 10.1209/0295-5075/88/68008


Abstract

In recommender systems where users rate objects, the similarity between two users is commonly quantified by a benchmark index, the Pearson correlation coefficient, which reflects the correlation between their ratings. An alternative is to compute the similarity solely from relevance information, namely whether a user has rated an object at all. The former uses more information than the latter and is intuitively expected to give more accurate rating predictions under the standard collaborative filtering framework. However, based on extensive experimental analysis, this letter reports the opposite result: the latter method, which uses only the relevance information, can outperform the former, especially when the data set is sparse. Our finding challenges the conventional wisdom on information filtering and suggests some alternatives for addressing the sparsity problem.
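
The contrast between the two similarity measures can be illustrated with a minimal sketch. The abstract does not specify the exact relevance-based index, so the cosine overlap of the sets of rated objects used below is an assumption, as are the function names, the toy ratings, and the mean-centred prediction rule (one common form of collaborative filtering, not necessarily the one used in the letter).

```python
import math


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson_similarity(ru, rv):
    """Pearson correlation over the objects both users rated.

    Returns 0.0 when it is undefined (fewer than two common objects,
    or zero variance), which happens often on sparse data.
    """
    common = sorted(set(ru) & set(rv))
    if len(common) < 2:
        return 0.0
    mu = mean(ru[o] for o in common)
    mv = mean(rv[o] for o in common)
    cov = sum((ru[o] - mu) * (rv[o] - mv) for o in common)
    var_u = sum((ru[o] - mu) ** 2 for o in common)
    var_v = sum((rv[o] - mv) ** 2 for o in common)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)


def relevance_similarity(ru, rv):
    """Similarity from relevance only: which objects were rated, not how.

    Assumed form: cosine overlap of the two sets of rated objects.
    """
    if not ru or not rv:
        return 0.0
    common = len(set(ru) & set(rv))
    return common / math.sqrt(len(ru) * len(rv))


def predict(ratings, user, obj, similarity):
    """Mean-centred, similarity-weighted prediction of user's rating of obj."""
    base = mean(ratings[user].values())
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or obj not in r:
            continue
        s = similarity(ratings[user], r)
        num += s * (r[obj] - mean(r.values()))
        den += abs(s)
    return base if den == 0 else base + num / den


# Hypothetical toy data: user -> {object: rating}.
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob": {"a": 4, "b": 2, "c": 3, "d": 5},
    "carol": {"a": 1, "d": 2},
}

for name, sim in [("pearson", pearson_similarity),
                  ("relevance", relevance_similarity)]:
    print(name, predict(ratings, "alice", "d", sim))
```

In this toy example "alice" and "carol" share only one rated object, so the Pearson coefficient is undefined and is set to zero here, whereas the relevance-based similarity still assigns "carol" a non-zero weight. This is the kind of situation that becomes frequent when the data set is sparse.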

