ABSTRACT
We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models, which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. Despite widespread use in language modeling and economics, the multinomial likelihood receives less attention in the recommender systems literature. We introduce a different regularization parameter for the learning objective, which proves to be crucial for achieving competitive performance. Remarkably, there is an efficient way to tune the parameter using annealing. The resulting model and learning algorithm have information-theoretic connections to maximum entropy discrimination and the information bottleneck principle. Empirically, we show that the proposed approach significantly outperforms several state-of-the-art baselines, including two recently proposed neural network approaches, on several real-world datasets. We also provide extended experiments comparing the multinomial likelihood with other commonly used likelihood functions in the latent factor collaborative filtering literature and show favorable results. Finally, we identify the pros and cons of employing a principled Bayesian inference approach and characterize settings where it provides the most significant improvements.
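As a rough illustration of the objective the abstract describes, the following NumPy sketch combines a multinomial log-likelihood with a KL term scaled by a regularization parameter that is annealed over training. The abstract does not give these details, so the following are assumptions made only for the example: the standard normal prior with a diagonal Gaussian posterior, the linear annealing schedule, and every name used (`beta_cap`, `anneal_steps`, `regularized_objective`, and so on). It is not the authors' implementation.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the item axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def multinomial_log_likelihood(x, logits):
    # x: (users, items) implicit-feedback counts or 0/1 indicators.
    # logits: (users, items) unnormalized item scores from a decoder.
    # Multinomial log-likelihood up to an additive constant in x.
    return (x * log_softmax(logits)).sum(axis=-1)

def gaussian_kl(mu, logvar):
    # KL(N(mu, diag(exp(logvar))) || N(0, I)) per user.
    # Assumption: diagonal Gaussian posterior and standard normal prior.
    return 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar).sum(axis=-1)

def annealed_beta(step, anneal_steps, beta_cap):
    # Hypothetical linear schedule: ramp from 0 up to beta_cap, then hold.
    return beta_cap * min(1.0, step / anneal_steps)

def regularized_objective(x, logits, mu, logvar, beta):
    # Mean over users of: log-likelihood minus beta-weighted KL.
    # This quantity is maximized during training.
    ll = multinomial_log_likelihood(x, logits)
    kl = gaussian_kl(mu, logvar)
    return float(np.mean(ll - beta * kl))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = (rng.random((3, 5)) < 0.3).astype(float)   # toy click matrix
    logits = rng.normal(size=(3, 5))               # stand-in decoder output
    mu = rng.normal(size=(3, 2))                   # stand-in encoder mean
    logvar = rng.normal(size=(3, 2))               # stand-in encoder log-variance
    for step in (0, 500, 2000):
        beta = annealed_beta(step, anneal_steps=1000, beta_cap=0.2)
        print(step, beta, regularized_objective(x, logits, mu, logvar, beta))
```

In this sketch, setting `beta` to 1 gives the usual evidence lower bound, while smaller values weaken the pull toward the prior. Starting at 0 and ramping upward lets one training run pass through a range of regularization strengths.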