Published in: Knowledge and Information Systems 5/2022

20.04.2022 | Regular paper

Knowledge distillation meets recommendation: collaborative distillation for top-N recommendation

Authors: Jae-woong Lee, Minjin Choi, Lee Sael, Hyunjung Shim, Jongwuk Lee


Abstract

Knowledge distillation (KD) is a successful method for transferring knowledge from one model (the teacher) to another (the student). Despite the success of KD in classification tasks, applying KD to recommender models is challenging because of the sparsity of positive feedback, the ambiguity of missing feedback, and the ranking nature of top-N recommendation. In this paper, we propose a new KD model for collaborative filtering, namely collaborative distillation (CD). Specifically, (1) we reformulate the loss function to deal with the ambiguity of missing feedback, (2) we exploit probabilistic rank-aware sampling for top-N recommendation, and (3) to train the proposed model effectively, we develop two training strategies for the student model, called teacher-guided and student-guided training, which adaptively select the most beneficial feedback from the teacher model. Furthermore, we extend our model using self-distillation, called born-again CD (BACD): a teacher and a student with the same model capacity are trained using the proposed distillation method. Experimental results demonstrate that CD outperforms the state-of-the-art method by 2.7–33.2% in hit rate (HR) and 2.7–29.9% in normalized discounted cumulative gain (NDCG). Moreover, BACD improves on the teacher model by 3.5–12.0% in HR and 4.9–13.3% in NDCG.
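The abstract's two core mechanisms, probabilistic rank-aware sampling of unobserved items and a distillation loss over the teacher's soft predictions, can be sketched in a few lines of Python. What follows is an illustrative reconstruction under stated assumptions, not the authors' released implementation: the exponential decay over teacher ranks and every name in it (rank_aware_sample, cd_loss, temperature, lam) are assumptions introduced here for exposition.

    # Illustrative sketch only; not the paper's released code.
    import torch
    import torch.nn.functional as F

    def rank_aware_sample(teacher_scores, unobserved_idx, n_samples, temperature=10.0):
        """Draw unobserved items with probability decaying in the teacher's rank,
        so the student is taught mostly on items the teacher is confident about."""
        scores = teacher_scores[unobserved_idx]
        ranks = scores.argsort(descending=True).argsort()  # rank 0 = top item
        weights = torch.exp(-ranks.float() / temperature)  # assumed decay schedule
        chosen = torch.multinomial(weights, n_samples, replacement=False)
        return unobserved_idx[chosen]

    def cd_loss(student_logits, observed_idx, teacher_probs, sampled_idx, lam=0.5):
        """Hard loss on observed positives plus a soft (distillation) loss that
        uses the teacher's probabilities on the sampled unobserved items as targets."""
        hard = F.binary_cross_entropy_with_logits(
            student_logits[observed_idx],
            torch.ones_like(student_logits[observed_idx]))
        soft = F.binary_cross_entropy_with_logits(
            student_logits[sampled_idx], teacher_probs[sampled_idx])
        return hard + lam * soft

    # Example for one user over 1,000 items (synthetic numbers):
    teacher_probs = torch.rand(1000)
    student_logits = torch.randn(1000, requires_grad=True)
    observed_idx = torch.tensor([3, 42, 99])
    unobserved_idx = torch.tensor([i for i in range(1000) if i not in {3, 42, 99}])
    sampled_idx = rank_aware_sample(teacher_probs, unobserved_idx, n_samples=50)
    loss = cd_loss(student_logits, observed_idx, teacher_probs, sampled_idx)
    loss.backward()

Under this reading, unsampled missing entries contribute nothing to the loss, which is how the ambiguity of missing feedback is sidestepped; the BACD variant would reuse the same procedure with a student whose capacity matches the teacher's.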


Metadata
Title
Knowledge distillation meets recommendation: collaborative distillation for top-N recommendation
Authors
Jae-woong Lee
Minjin Choi
Lee Sael
Hyunjung Shim
Jongwuk Lee
Publication date
20.04.2022
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 5/2022
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-022-01667-8
