Self-supervised Learning for Large-scale Item Recommendations

Authors:
Tiansheng Yao

Google, Inc., Mountain View, CA, USA

Xinyang Yi

Google, Inc., Mountain View, CA, USA

Derek Zhiyuan Cheng

Google, Inc., Mountain View, CA, USA

Felix Yu

Google, Inc., New York City, NY, USA

Ting Chen

Google, Inc., Toronto, Canada

Aditya Menon

Google, Inc., New York City, NY, USA

Lichan Hong

Google, Inc., Mountain View, CA, USA

Ed H. Chi

Google, Inc., Mountain View, CA, USA

Steve Tjoa

Google, Inc., Mountain View, CA, USA

Jieqi (Jay) Kang

Google, Inc., Mountain View, CA, USA

Evan Ettinger

Google, Inc., Mountain View, CA, USA


CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, Pages 4321–4330. https://doi.org/10.1145/3459637.3481952

Published: 30 October 2021

ABSTRACT

Large-scale recommender models find the most relevant items from huge catalogs, and they play a critical role in modern search and recommendation systems. To model an input space with large-vocabulary categorical features, a typical recommender model uses neural networks to learn a joint embedding space for both queries and items from user feedback data. However, with millions to billions of items in the corpus, users tend to provide feedback on only a very small set of them, so the feedback follows a power-law distribution. This makes the feedback data for long-tail items extremely sparse.

Inspired by the recent success of self-supervised representation learning in both computer vision and natural language understanding, we propose a multi-task self-supervised learning (SSL) framework for large-scale item recommendations. The framework is designed to tackle the label sparsity problem by learning better latent relationships among item features. Specifically, SSL improves item representation learning and also serves as additional regularization to improve generalization. Furthermore, we propose a novel data augmentation method that utilizes feature correlations within the proposed framework.
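As a rough illustration of the multi-task idea described above, the sketch below adds a contrastive auxiliary loss, computed on two masked views of each item, to a main supervised loss. It is a sketch under stated assumptions only: the abstract does not specify the loss or the augmentation, so the InfoNCE-style loss, the random complementary feature split (a stand-in for the correlation-aware augmentation), the `toy_encoder`, the example item features, and the weight `alpha` are all hypothetical choices and not the authors' implementation.

```python
import math
import random


def two_views(item, rng):
    """Split an item's features into two complementary masked views.

    Assumption: the split is random. The paper's augmentation uses
    feature correlations, which this sketch does not model.
    """
    names = sorted(item)
    rng.shuffle(names)
    half = len(names) // 2
    first = set(names[:half])
    view_a = {k: (v if k in first else None) for k, v in item.items()}
    view_b = {k: (None if k in first else v) for k, v in item.items()}
    return view_a, view_b


def toy_encoder(view, dim=8):
    """Hypothetical stand-in for an item tower: sum fixed pseudo-random
    vectors of the unmasked features, then L2-normalize."""
    total = [0.0] * dim
    for name, value in view.items():
        if value is None:  # a masked feature contributes nothing
            continue
        feature_rng = random.Random(f"{name}={value}")
        for d in range(dim):
            total[d] += feature_rng.gauss(0.0, 1.0)
    norm = math.sqrt(sum(x * x for x in total)) or 1.0
    return [x / norm for x in total]


def contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss: row i of z_a should match row i of z_b
    against the other items in the batch."""
    loss = 0.0
    for i, anchor in enumerate(z_a):
        logits = [sum(p * q for p, q in zip(anchor, other)) / temperature
                  for other in z_b]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        loss += log_sum - logits[i]
    return loss / len(z_a)


def total_loss(main_loss, ssl_loss, alpha=0.1):
    """Multi-task objective: supervised loss plus weighted SSL loss."""
    return main_loss + alpha * ssl_loss


rng = random.Random(0)
items = [  # made-up categorical item features
    {"category": "games", "developer": "dev_a", "language": "en", "price": "free"},
    {"category": "tools", "developer": "dev_b", "language": "de", "price": "paid"},
]
views = [two_views(item, rng) for item in items]
z_a = [toy_encoder(a) for a, _ in views]
z_b = [toy_encoder(b) for _, b in views]
ssl = contrastive_loss(z_a, z_b)
print(total_loss(main_loss=1.0, ssl_loss=ssl))  # main_loss is a placeholder value
```

In this sketch `alpha` controls how strongly the auxiliary task regularizes the main task; in practice it would be tuned on validation data.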

We evaluate our framework on two real-world datasets with 500M and 1B training examples, respectively. Our results demonstrate the effectiveness of SSL regularization and show that it outperforms state-of-the-art regularization techniques. We have also launched the proposed techniques in a web-scale commercial app-to-app recommendation system, where A/B experiments on live traffic demonstrated significant improvements in top-tier business metrics. Our online results also verify our hypothesis that the framework improves model performance even more on slices that lack supervision.

Index Terms

1. Information systems
  1. Information retrieval

Published in
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637
General Chairs: Gianluca Demartini (The University of Queensland, Australia), Guido Zuccon (The University of Queensland, Australia)
Program Chairs: J. Shane Culpepper (RMIT University, Australia), Zi Huang (The University of Queensland, Australia), Hanghang Tong (University of Illinois at Urbana-Champaign, USA)
Copyright © 2021 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
contrastive learning
neural networks
recommender systems
self-supervised learning
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%