ABSTRACT
Collaborative filtering identifies the information interests of a particular user based on information provided by other, similar users. Memory-based approaches to collaborative filtering (e.g., the Pearson correlation coefficient approach) measure the similarity between two users by comparing their ratings on a set of items. In these approaches, items are weighted either equally or by some predefined function, so the impact of rating discrepancies among different users is not taken into consideration. For example, an item that is highly favored by most users should have a smaller impact on user similarity than an item for which different types of users tend to give different ratings. Although simple weighting methods such as variance weighting try to address this problem, empirical studies have shown that they are ineffective in improving the performance of collaborative filtering. In this paper, we present an optimization algorithm that automatically computes the weights for different items based on their ratings from training users. More specifically, the new weighting scheme creates a clustered distribution of user vectors in the item space by bringing users with similar interests closer together and pushing users with different interests farther apart. Empirical studies over two datasets show that the new weighting scheme substantially improves the performance of the Pearson correlation coefficient method for collaborative filtering.
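To make the role of item weights concrete, the following is a minimal sketch of an item-weighted Pearson correlation between two users. It is not the optimization algorithm proposed in the paper: the weights are simply passed in, and the function name `weighted_pearson`, the dictionary-based rating format, and the choice to compute weighted means over co-rated items only are all assumptions made for illustration.

```python
import math


def weighted_pearson(ratings_a, ratings_b, weights=None):
    """Item-weighted Pearson correlation between two users (illustrative).

    ratings_a, ratings_b: dicts mapping item id -> numeric rating.
    weights: optional dict mapping item id -> non-negative weight;
             items missing from it default to 1.0 (equal weighting).
    Returns 0.0 when the correlation is undefined.
    """
    # Only items rated by both users contribute to the similarity.
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0

    w = [1.0 if weights is None else weights.get(i, 1.0) for i in common]
    total = sum(w)
    if total <= 0:
        return 0.0

    a = [ratings_a[i] for i in common]
    b = [ratings_b[i] for i in common]

    # Assumption: means are weighted and taken over co-rated items only.
    mean_a = sum(wi * x for wi, x in zip(w, a)) / total
    mean_b = sum(wi * y for wi, y in zip(w, b)) / total

    cov = sum(wi * (x - mean_a) * (y - mean_b) for wi, x, y in zip(w, a, b))
    var_a = sum(wi * (x - mean_a) ** 2 for wi, x in zip(w, a))
    var_b = sum(wi * (y - mean_b) ** 2 for wi, y in zip(w, b))

    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


# Hypothetical ratings: the two users agree on items 1-3 and disagree on item 4.
user_a = {1: 1, 2: 2, 3: 3, 4: 5}
user_b = {1: 2, 2: 4, 3: 6, 4: 1}

equal = weighted_pearson(user_a, user_b)
downweighted = weighted_pearson(user_a, user_b, weights={4: 0.0})
```

With equal weights, the disagreement on item 4 dominates and the similarity is negative; setting that item's weight to zero leaves only the items on which the users agree. An automatic weighting scheme would learn such weights from training users rather than take them as input.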
Index Terms
- An automatic weighting scheme for collaborative filtering