2018 | Original Paper | Book Chapter

Con-CNAME: A Contextual Multi-armed Bandit Algorithm for Personalized Recommendations

Authors: Xiaofang Zhang, Qian Zhou, Tieke He, Bin Liang

Published in: Artificial Neural Networks and Machine Learning – ICANN 2018

Publisher: Springer International Publishing

Abstract

Reinforcement learning algorithms play an important role today and have been applied in many domains. For example, the personalized recommendation problem can be modelled as a contextual multi-armed bandit problem in reinforcement learning. In this paper, we propose a contextual bandit algorithm based on Contexts and the Chosen Number of Arm with Minimal Estimation, Con-CNAME for short. The continuous exploration and the use of context in our algorithm can address the cold-start problem in recommender systems. Furthermore, Con-CNAME can still make recommendations in emergency circumstances where contexts suddenly become unavailable. The experimental evaluation discusses the reference range of the key parameters and the stability of Con-CNAME in detail, and compares the performance of Con-CNAME with some classic algorithms. Experimental results show that our algorithm outperforms several bandit algorithms.
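
The abstract does not give the details of Con-CNAME, so the following is only a generic illustration of the contextual bandit setting it describes, not the authors' algorithm. It is a minimal epsilon-greedy sketch in which every name (`EpsilonGreedyContextualBandit`, `train_demo`, the contexts "sports" and "tech", the click probabilities) is a hypothetical assumption. It shows two ideas mentioned in the abstract: exploration that never stops, and a fallback to context-free estimates when the context is unavailable.

```python
import random


class EpsilonGreedyContextualBandit:
    """Generic epsilon-greedy contextual bandit (illustrative only; NOT Con-CNAME)."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # (context, arm) -> (pull count, mean reward); context None holds context-free stats.
        self.stats = {}

    def _estimate(self, context, arm):
        count, mean = self.stats.get((context, arm), (0, 0.0))
        if count == 0:
            # Unseen (context, arm) pair: fall back to the context-free mean.
            count, mean = self.stats.get((None, arm), (0, 0.0))
        return mean

    def select(self, context=None):
        # Explore with a fixed probability, so exploration continues indefinitely.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        # Exploit: pick the arm with the highest estimated reward (ties go to the lowest index).
        return max(range(self.n_arms), key=lambda arm: self._estimate(context, arm))

    def update(self, arm, reward, context=None):
        # Update per-context and context-free running means; the set avoids a
        # double update when context is None.
        for key in {(context, arm), (None, arm)}:
            count, mean = self.stats.get(key, (0, 0.0))
            count += 1
            self.stats[key] = (count, mean + (reward - mean) / count)


def train_demo(rounds=2000):
    """Simulate clicks from assumed probabilities: each context prefers a different arm."""
    click_prob = {"sports": [0.9, 0.1], "tech": [0.1, 0.9]}  # hypothetical environment
    env_rng = random.Random(1)
    bandit = EpsilonGreedyContextualBandit(n_arms=2, epsilon=0.1, seed=0)
    for _ in range(rounds):
        context = env_rng.choice(["sports", "tech"])
        arm = bandit.select(context)
        reward = 1.0 if env_rng.random() < click_prob[context][arm] else 0.0
        bandit.update(arm, reward, context)
    return bandit


if __name__ == "__main__":
    bandit = train_demo()
    bandit.epsilon = 0.0  # switch off exploration to inspect the learned preferences
    print(bandit.select("sports"), bandit.select("tech"), bandit.select(None))
```

Calling `select(None)` corresponds to the case where the context is missing: the sketch still returns a recommendation, using only the context-free averages.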


Metadata
Title
Con-CNAME: A Contextual Multi-armed Bandit Algorithm for Personalized Recommendations
Authors
Xiaofang Zhang
Qian Zhou
Tieke He
Bin Liang
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01421-6_32