DOI: 10.1145/3178876.3186150
research-article

Variational Autoencoders for Collaborative Filtering

Published: 23 April 2018

ABSTRACT

We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models, which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. Despite widespread use in language modeling and economics, the multinomial likelihood receives less attention in the recommender systems literature. We introduce a different regularization parameter for the learning objective, which proves to be crucial for achieving competitive performance. Remarkably, there is an efficient way to tune the parameter using annealing. The resulting model and learning algorithm have information-theoretic connections to maximum entropy discrimination and the information bottleneck principle. Empirically, we show that the proposed approach significantly outperforms several state-of-the-art baselines, including two recently proposed neural network approaches, on several real-world datasets. We also provide extended experiments comparing the multinomial likelihood with other commonly used likelihood functions in the latent factor collaborative filtering literature and show favorable results. Finally, we identify the pros and cons of employing a principled Bayesian inference approach and characterize settings where it provides the most significant improvements.
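
The abstract names three ingredients without giving formulas: a multinomial likelihood over items, a regularization parameter on the learning objective, and annealing to tune that parameter. The sketch below shows one plausible reading of how these fit together. It is not taken from the paper's text on this page. It assumes a standard normal prior, a diagonal Gaussian approximate posterior, and a linear annealing schedule, and every function name and argument (`annealed_beta`, `anneal_steps`, `beta_cap`, and so on) is hypothetical.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the item axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def multinomial_log_likelihood(x, logits):
    # x: (users, items) click counts or 0/1 indicators.
    # logits: (users, items) decoder outputs (assumed given here).
    # Multinomial log-likelihood up to a constant that depends only on x.
    return (x * log_softmax(logits)).sum(axis=-1)

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value per user.
    # Assumption: standard normal prior, diagonal Gaussian posterior.
    return 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar).sum(axis=-1)

def annealed_beta(step, anneal_steps, beta_cap):
    # Hypothetical linear schedule: the regularization weight grows
    # from 0 to beta_cap over anneal_steps updates, then stays flat.
    return min(beta_cap, beta_cap * step / anneal_steps)

def negative_objective(x, logits, mu, logvar, beta):
    # Minibatch average of -(log-likelihood - beta * KL).
    ll = multinomial_log_likelihood(x, logits)
    kl = gaussian_kl(mu, logvar)
    return float(np.mean(-ll + beta * kl))
```

With `beta` fixed at 1 this reduces to the usual negative evidence lower bound under the stated assumptions; smaller values weaken the pull of the posterior toward the prior, which is the role the abstract attributes to the regularization parameter.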

Published in

WWW '18: Proceedings of the 2018 World Wide Web Conference
April 2018
2000 pages
ISBN: 9781450356398

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland

Acceptance Rates

WWW '18 paper acceptance rate: 170 of 1,155 submissions, 15%. Overall acceptance rate: 1,899 of 8,196 submissions, 23%.
