
2015 | Original Paper | Book Chapter

Bandits and Recommender Systems

Authors: Jérémie Mary, Romaric Gaudel, Philippe Preux

Published in: Machine Learning, Optimization, and Big Data

Publisher: Springer International Publishing


Abstract

This paper addresses the on-line recommendation problem in the presence of new users and new items; we assume that no information is available about either the users or the items. The only source of information is the set of ratings given by users to some items. By on-line, we mean that the sets of users, items, and ratings evolve over time, and that at any moment the recommendation system has to select items to recommend based on the information available so far, that is, essentially the sequence of past events. We also mean that each user comes with preferences that may evolve over both short and long time scales, so these preferences have to be updated continuously. When the set of ratings is the only available source of information, the traditional approach is matrix factorization. In a setting of decision making under uncertainty, actions should be selected so as to balance exploration with exploitation; this is best modeled as a bandit problem. Matrix factors provide a latent representation of users and items, and these representations can then be used as contextual information by a bandit algorithm to select items. This last point is exactly the originality of this paper: the combination of matrix factorization and bandit algorithms to solve the on-line recommendation problem. Our work is driven by viewing the recommendation problem as a feedback-controlled loop, which leads to interactions between representation learning and the recommendation policy.
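To make the combination concrete, here is a minimal Python sketch of one plausible instantiation; it is not the authors' exact algorithm. Alternating least squares (ALS) periodically re-estimates latent item factors from the ratings gathered so far, and a per-user LinUCB-style bandit uses those factors as context to choose the next item to recommend. The dimensions, the exploration strength alpha, the refresh schedule, and the simulated rating model are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical simulation: a hidden low-rank preference matrix generates ratings.
    n_users, n_items, k = 30, 50, 5
    U_true = rng.normal(size=(n_users, k))
    V_true = rng.normal(size=(n_items, k))
    R_true = U_true @ V_true.T

    R = np.zeros((n_users, n_items))      # ratings observed so far
    mask = np.zeros((n_users, n_items))   # 1 where a rating has been observed

    def als(R, mask, k, n_iters=15, lam=0.1):
        # Alternating least squares restricted to the observed entries.
        n_u, n_i = R.shape
        U = rng.normal(scale=0.1, size=(n_u, k))
        V = rng.normal(scale=0.1, size=(n_i, k))
        for _ in range(n_iters):
            for u in range(n_u):
                obs = mask[u] > 0
                if obs.any():
                    U[u] = np.linalg.solve(V[obs].T @ V[obs] + lam * np.eye(k),
                                           V[obs].T @ R[u, obs])
            for i in range(n_i):
                obs = mask[:, i] > 0
                if obs.any():
                    V[i] = np.linalg.solve(U[obs].T @ U[obs] + lam * np.eye(k),
                                           U[obs].T @ R[obs, i])
        return U, V

    # Per-user LinUCB statistics over the latent item features.
    A = [np.eye(k) for _ in range(n_users)]    # design matrices
    b = [np.zeros(k) for _ in range(n_users)]  # response vectors
    alpha = 0.5                                # exploration strength (assumed)
    V_hat = rng.normal(scale=0.1, size=(n_items, k))  # initial item representation

    for t in range(2000):
        u = int(rng.integers(n_users))
        A_inv = np.linalg.inv(A[u])
        theta = A_inv @ b[u]                   # current estimate of user u's taste
        # Upper confidence bound per item: predicted rating + exploration bonus.
        bonus = np.sqrt(np.sum((V_hat @ A_inv) * V_hat, axis=1))
        i = int(np.argmax(V_hat @ theta + alpha * bonus))
        r = R_true[u, i] + rng.normal(scale=0.1)   # noisy feedback from user u
        A[u] += np.outer(V_hat[i], V_hat[i])
        b[u] += r * V_hat[i]
        R[u, i], mask[u, i] = r, 1.0
        if (t + 1) % 500 == 0:
            # Periodically refresh the latent representation from all ratings,
            # then reset the bandit statistics, since the basis has changed.
            _, V_hat = als(R, mask, k)
            A = [np.eye(k) for _ in range(n_users)]
            b = [np.zeros(k) for _ in range(n_users)]

Resetting the per-user bandit statistics after each refactorization keeps the confidence ellipsoids consistent with the new latent basis, at the price of some re-exploration; this kind of interaction between the learned representation and the recommendation policy is precisely the feedback loop the abstract describes.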


Footnotes
1
\(\tilde{O}\) means \(O\) up to a logarithmic factor in \(T\).
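As a concrete illustration (ours, not the paper's): a regret bound stated as \(\tilde{O}(\sqrt{T})\) should be read as \(O(\sqrt{T}\,\log^{c} T)\) for some constant \(c > 0\); the polynomial dependence on the horizon \(T\) is made explicit while logarithmic factors are suppressed.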
 
Metadata
Title
Bandits and Recommender Systems
Authors
Jérémie Mary
Romaric Gaudel
Philippe Preux
Copyright Year
2015
DOI
https://doi.org/10.1007/978-3-319-27926-8_29