Variational Autoencoders for Collaborative Filtering

Authors:
Dawen Liang (Netflix, Los Gatos, CA, USA)
Rahul G. Krishnan (Massachusetts Institute of Technology, Cambridge, MA, USA)
Matthew D. Hoffman (Google AI, San Francisco, CA, USA)
Tony Jebara (Netflix, Los Gatos, CA, USA)

WWW '18: Proceedings of the 2018 World Wide Web Conference, April 2018, Pages 689–698. https://doi.org/10.1145/3178876.3186150

Published: 23 April 2018

ABSTRACT

We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models, which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. Despite widespread use in language modeling and economics, the multinomial likelihood receives less attention in the recommender systems literature. We introduce a different regularization parameter for the learning objective, which proves to be crucial for achieving competitive performance. Remarkably, there is an efficient way to tune the parameter using annealing. The resulting model and learning algorithm have information-theoretic connections to maximum entropy discrimination and the information bottleneck principle. Empirically, we show that the proposed approach significantly outperforms several state-of-the-art baselines, including two recently proposed neural network approaches, on several real-world datasets. We also provide extended experiments comparing the multinomial likelihood with other commonly used likelihood functions in the latent factor collaborative filtering literature and show favorable results. Finally, we identify the pros and cons of employing a principled Bayesian inference approach and characterize settings where it provides the most significant improvements.
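
To make the objective described in the abstract more concrete, the following is a minimal NumPy sketch of a multinomial log-likelihood combined with a regularization weight that is annealed during training. The abstract does not spell out the formulas, so several details here are assumptions for illustration: the softmax parameterization of the item probabilities, the standard Gaussian prior with a diagonal Gaussian approximate posterior, the linear annealing schedule, and all function and parameter names (`annealed_beta`, `beta_cap`, `total_anneal_steps`, and so on).

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last (item) axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def multinomial_log_likelihood(logits, clicks):
    """Per-user multinomial log-likelihood, sum_i x_i * log pi_i.

    The multinomial coefficient is dropped because it does not depend on
    the model parameters. `clicks` is a (users, items) implicit-feedback
    matrix; `logits` are unnormalized item scores of the same shape.
    """
    return (clicks * log_softmax(logits)).sum(axis=-1)


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) per user.

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar).sum(axis=-1)


def annealed_beta(step, total_anneal_steps, beta_cap):
    """Linearly increase the regularization weight from 0 up to beta_cap.

    Assumption: a linear schedule; other schedules are possible.
    """
    if total_anneal_steps <= 0:
        return beta_cap
    return min(beta_cap, beta_cap * step / total_anneal_steps)


def regularized_objective(logits, clicks, mu, logvar, beta):
    """Batch mean of: log-likelihood minus beta times the KL term."""
    per_user = multinomial_log_likelihood(logits, clicks) - beta * gaussian_kl(mu, logvar)
    return per_user.mean()
```

In a training loop one would call `annealed_beta(step, ...)` at each update and maximize `regularized_objective` with the returned weight. Starting the weight at zero lets the model fit the data before the regularizer is gradually switched on, which is one plausible reading of the annealing procedure the abstract mentions.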

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Learning in probabilistic graphical models
        Latent variable models
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems
  2. Information systems applications
    1. Data mining
      1. Collaborative filtering

Published in
WWW '18: Proceedings of the 2018 World Wide Web Conference
April 2018
2000 pages
ISBN: 9781450356398
General Chairs: Pierre-Antoine Champin (Université Claude Bernard Lyon 1, France), Fabien Gandon (Inria, Université Côte d'Azur, CNRS, I3S, France), Lionel Médini (Université Claude Bernard Lyon 1, France)
Program Chairs: Mounia Lalmas (Spotify, UK), Panagiotis G. Ipeirotis (New York University, USA)
Copyright © 2018 ACM
Publisher
International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland
Author Tags
bayesian models
collaborative filtering
implicit feedback
recommender systems
variational autoencoder
Qualifiers
- research-article
Acceptance Rates
WWW '18 paper acceptance rate: 170 of 1,155 submissions (15%)
Overall acceptance rate: 1,899 of 8,196 submissions (23%)