ABSTRACT
Collaborative prediction involves filling in missing entries of a user-item matrix to predict users' preferences from their observed preferences. Most existing models assume that the data is missing at random (MAR), an assumption that is often violated in practical recommender systems. An incorrect assumption about the missing data ignores the missing data mechanism, leading to biased inference and prediction. In this paper we present a Bayesian binomial mixture model for collaborative prediction, in which the generative process for the data and the missing data mechanism are jointly modeled to handle non-random missing data. The missing data mechanism is modeled by three factors, related to users, items, and rating values, respectively. Each factor is modeled by a Bernoulli random variable, and whether a rating is observed is determined by the Boolean OR of the three binary variables. We develop computationally efficient variational inference algorithms, in which the variational parameters have closed-form update rules and the computational complexity depends on the number of observed ratings rather than on the size of the rating matrix. We also discuss implementation issues concerning hyperparameter tuning and estimation based on empirical Bayes. Experiments on the Yahoo! Music and MovieLens datasets confirm the useful behavior of our model by demonstrating that: (1) it outperforms state-of-the-art methods in predictive performance; (2) it finds meaningful solutions instead of undesirable boundary solutions; (3) it provides rating trend analysis on why ratings are observed.
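The observation mechanism described in the abstract (a Boolean OR of three Bernoulli variables) can be illustrated with a minimal sketch. The abstract does not give the model's actual parameterization, so the function names, the parameter names `p_user`, `p_item`, and `p_value`, and the assumption that the three variables are independent given their parameters are all hypothetical choices made for illustration only.

```python
import random


def observe_probability(p_user, p_item, p_value):
    """Probability that a rating is observed when the observation indicator
    is the OR of three Bernoulli variables.

    Assumption (not stated in the abstract): the three variables are
    independent given their success probabilities.
    """
    # OR is false only when all three variables are 0.
    return 1.0 - (1.0 - p_user) * (1.0 - p_item) * (1.0 - p_value)


def sample_observed(p_user, p_item, p_value, rng):
    """Draw one observation indicator by sampling the three binary factors."""
    z_user = rng.random() < p_user    # user-related factor (hypothetical name)
    z_item = rng.random() < p_item    # item-related factor (hypothetical name)
    z_value = rng.random() < p_value  # rating-value-related factor (hypothetical name)
    return z_user or z_item or z_value


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20000
    hits = sum(sample_observed(0.1, 0.2, 0.5, rng) for _ in range(n))
    print("analytic :", observe_probability(0.1, 0.2, 0.5))
    print("empirical:", hits / n)
```

Under this sketch, a rating is more likely to be observed if any one of the three factors is active, which is one way a dependence of the observation on the rating value itself (non-random missingness) can be expressed.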
Index Terms
- Bayesian binomial mixture model for collaborative prediction with non-random missing data