Article

An automatic weighting scheme for collaborative filtering

Authors:
Rong Jin, Michigan State University, East Lansing, MI
Joyce Y. Chai, Michigan State University, East Lansing, MI
Luo Si, Carnegie Mellon University, Pittsburgh, PA

SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, July 2004, Pages 337–344, https://doi.org/10.1145/1008992.1009051

Published: 25 July 2004

ABSTRACT

Collaborative filtering identifies the information interests of a particular user based on information provided by other, similar users. Memory-based approaches to collaborative filtering (e.g., the Pearson correlation coefficient approach) measure the similarity between two users by comparing their ratings on a set of items. In these approaches, items are weighted either equally or by predefined functions, and the impact of rating discrepancies among different users is not taken into account. For example, an item that is highly favored by most users should have a smaller impact on user similarity than an item that different types of users tend to rate differently. Although simple weighting methods such as variance weighting try to address this problem, empirical studies have shown that they are ineffective in improving the performance of collaborative filtering. In this paper, we present an optimization algorithm that automatically computes the weights for different items based on their ratings from training users. More specifically, the new weighting scheme creates a clustered distribution of user vectors in the item space by bringing users of similar interests closer together and pushing users of different interests farther apart. Empirical studies over two datasets show that the new weighting scheme substantially improves the performance of the Pearson correlation coefficient method for collaborative filtering.
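
To make the role of item weights concrete, the following is a minimal sketch of a weighted Pearson correlation between two users, together with a plain per-item variance weight of the kind the abstract calls a simple weighting method. It is not the optimization algorithm proposed in the paper, which the abstract does not specify. The function names `weighted_pearson` and `variance_weights`, the use of raw (unnormalized) variance as the weight, and the three sample users are all assumptions made for illustration.

```python
import math


def weighted_pearson(ratings_a, ratings_b, weights=None):
    """Weighted Pearson correlation over the items both users rated.

    ratings_a, ratings_b: dicts mapping item id -> rating.
    weights: optional dict mapping item id -> non-negative weight;
             items missing from it default to 1.0 (the unweighted case).
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if weights is None:
        w = [1.0] * len(common)
    else:
        w = [weights.get(item, 1.0) for item in common]
    total = sum(w)
    if len(common) < 2 or total == 0:
        return 0.0  # not enough evidence to compare the two users

    # Weighted means over the co-rated items.
    mean_a = sum(wi * ratings_a[i] for wi, i in zip(w, common)) / total
    mean_b = sum(wi * ratings_b[i] for wi, i in zip(w, common)) / total

    cov = var_a = var_b = 0.0
    for wi, i in zip(w, common):
        da = ratings_a[i] - mean_a
        db = ratings_b[i] - mean_b
        cov += wi * da * db
        var_a += wi * da * da
        var_b += wi * db * db
    if var_a == 0 or var_b == 0:
        return 0.0  # one user rated every weighted item identically
    return cov / math.sqrt(var_a * var_b)


def variance_weights(all_ratings):
    """Assumed simple baseline: weight each item by its rating variance."""
    per_item = {}
    for user_ratings in all_ratings.values():
        for item, rating in user_ratings.items():
            per_item.setdefault(item, []).append(rating)
    weights = {}
    for item, rs in per_item.items():
        mean = sum(rs) / len(rs)
        weights[item] = sum((r - mean) ** 2 for r in rs) / len(rs)
    return weights


# Hypothetical ratings: everyone likes "hit", tastes differ on the rest.
users = {
    "alice": {"hit": 5, "drama": 5, "horror": 1},
    "bob": {"hit": 5, "drama": 4, "horror": 2},
    "carol": {"hit": 5, "drama": 1, "horror": 5},
}
weights = variance_weights(users)
print(weighted_pearson(users["alice"], users["bob"]))
print(weighted_pearson(users["alice"], users["bob"], weights))
```

In this toy data the universally liked item receives zero weight, so only the items on which users disagree contribute to the similarity. A learned weighting scheme would replace `variance_weights` with weights fitted to the training users.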

Index Terms

An automatic weighting scheme for collaborative filtering
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Document filtering
      2. Information extraction

Published in
SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
July 2004
624 pages
ISBN: 1581138814
DOI: 10.1145/1008992
General Chair: Mark Sanderson, University of Sheffield (UK)
Program Chairs:
Kalervo Järvelin, University of Tampere (Finland)
James Allan, University of Massachusetts (USA)
Peter Bruza, Distributed Systems Technology Centre (Australia)
Copyright © 2004 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
collaborative filtering
item weighting scheme
leave one out method
memory-based approach
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 792 of 3,983 submissions, 20%
Article Metrics
- Total Citations: 204
- Total Downloads: 1,706
- Downloads (Last 12 months): 22
- Downloads (Last 6 weeks): 2