Published in: Knowledge and Information Systems 5/2022

20.04.2022 | Regular paper

Knowledge distillation meets recommendation: collaborative distillation for top-N recommendation

Authors: Jae-woong Lee, Minjin Choi, Lee Sael, Hyunjung Shim, Jongwuk Lee


Abstract

Knowledge distillation (KD) is a successful method for transferring knowledge from one model (the teacher) to another (the student). Despite the success of KD in classification tasks, applying KD to recommender models is challenging because of the sparsity of positive feedback, the ambiguity of missing feedback, and the ranking nature of top-N recommendation. In this paper, we propose a new KD model for collaborative filtering, namely collaborative distillation (CD). Specifically, (1) we reformulate the loss function to deal with the ambiguity of missing feedback, (2) we exploit probabilistic rank-aware sampling for top-N recommendation, and (3) to train the proposed model effectively, we develop two training strategies for the student model, called teacher-guided and student-guided training, which adaptively select the most beneficial feedback from the teacher model. Furthermore, we extend our model using self-distillation, called born-again CD (BACD): a teacher and a student with the same model capacity are trained using the proposed distillation method. Experimental results demonstrate that CD outperforms the state-of-the-art method by 2.7–33.2% in hit rate (HR) and 2.7–29.9% in normalized discounted cumulative gain (NDCG). Moreover, BACD improves on the teacher model by 3.5–12.0% in HR and 4.9–13.3% in NDCG.
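The abstract's two core mechanisms, probabilistic rank-aware sampling of unobserved items and a distillation loss over the teacher's soft predictions, can be sketched in a few lines of Python. What follows is an illustrative reconstruction under stated assumptions, not the authors' released implementation: the exponential decay over teacher ranks and every name in it (rank_aware_sample, cd_loss, temperature, lam) are assumptions introduced here for exposition.

    # Illustrative sketch only; not the paper's released code.
    import torch
    import torch.nn.functional as F

    def rank_aware_sample(teacher_scores, unobserved_idx, n_samples, temperature=10.0):
        """Draw unobserved items with probability decaying in the teacher's rank,
        so the student is taught mostly on items the teacher is confident about."""
        scores = teacher_scores[unobserved_idx]
        ranks = scores.argsort(descending=True).argsort()  # rank 0 = top item
        weights = torch.exp(-ranks.float() / temperature)  # assumed decay schedule
        chosen = torch.multinomial(weights, n_samples, replacement=False)
        return unobserved_idx[chosen]

    def cd_loss(student_logits, observed_idx, teacher_probs, sampled_idx, lam=0.5):
        """Hard loss on observed positives plus a soft (distillation) loss that
        uses the teacher's probabilities on the sampled unobserved items as targets."""
        hard = F.binary_cross_entropy_with_logits(
            student_logits[observed_idx],
            torch.ones_like(student_logits[observed_idx]))
        soft = F.binary_cross_entropy_with_logits(
            student_logits[sampled_idx], teacher_probs[sampled_idx])
        return hard + lam * soft

    # Example for one user over 1,000 items (synthetic numbers):
    teacher_probs = torch.rand(1000)
    student_logits = torch.randn(1000, requires_grad=True)
    observed_idx = torch.tensor([3, 42, 99])
    unobserved_idx = torch.tensor([i for i in range(1000) if i not in {3, 42, 99}])
    sampled_idx = rank_aware_sample(teacher_probs, unobserved_idx, n_samples=50)
    loss = cd_loss(student_logits, observed_idx, teacher_probs, sampled_idx)
    loss.backward()

Under this reading, unsampled missing entries contribute nothing to the loss, which is how the ambiguity of missing feedback is sidestepped; the BACD variant would reuse the same procedure with a student whose capacity matches the teacher's.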


Metadata
Title
Knowledge distillation meets recommendation: collaborative distillation for top-N recommendation
Authors
Jae-woong Lee
Minjin Choi
Lee Sael
Hyunjung Shim
Jongwuk Lee
Publication date
20.04.2022
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 5/2022
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-022-01667-8
