Skip to main content

2017 | OriginalPaper | Buchkapitel

Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

verfasst von : Johanne Cohen, Amélie Héliou, Panayotis Mertikopoulos

Erschienen in: Algorithmic Game Theory

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

This paper examines the problem of multi-agent learning in \(N\)-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that (a) dominated strategies become extinct (a.s.); and (b) in generic games, pure Nash equilibria are attracting with high probability, even in the presence of uncertainty and noise of arbitrarily high variance. Moreover, if the algorithm’s step-size does not decay too fast, we show that these properties occur at a quasi-exponential rate – that is, much faster than the algorithm’s \({{\mathrm{\mathcal O}}}(1/\sqrt{T})\) worst-case regret guarantee would suggest.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
As the second name suggests, this set contains the game’s set of correlated equilibria (CE) – and hence, the game’s set of Nash equilibria as well.
 
2
In the above \((\alpha _{i};\alpha _{-i})\) is shorthand for \((\alpha _{1},\ldots ,\alpha _{i},\ldots ,\alpha _{N})\), used here to highlight the action of player \(i\) against that of all other players.
 
3
Note here that (10a) is phrased in terms of the players’ mixed strategy profile \(x(t)\), not the action profile \(\alpha (t) = (\alpha _{i}(t);\alpha _{-i}(t))\) which is chosen based on \(x(t)\) at stage \(t\). To see that (H1) indeed implies (10a), simply recall that \(x_{i}(t) = {{\mathrm{\Lambda }}}_{i}(y_{i}(t-1))\) so .
 
4
To the best of our knowledge, the closest result is the elimination of dominated strategies under exponential learning in continuous time [20]; for a survey of the relevant literature, see [25].
 
5
Recall also the counterexample of [26] discussed in the introduction.
 
6
This can be shown by an induction argument on the rounds of elimination of dominated strategies as in [18].
 
Literatur
1.
Zurück zum Zitat Arora, S., Hazan, E., Kale, S.: The multiplicative weights update method: a meta-algorithm and applications. Theory Comput. 8(1), 121–164 (2012)MathSciNetCrossRef Arora, S., Hazan, E., Kale, S.: The multiplicative weights update method: a meta-algorithm and applications. Theory Comput. 8(1), 121–164 (2012)MathSciNetCrossRef
2.
Zurück zum Zitat Blum, A., Hajiaghayi, M.T., Ligett, K., Roth, A.: Regret minimization and the price of total anarchy. In: STOC 2008: Proceedings of the 40th Annual ACM Symposium on the Theory of Computing, pp. 373–382. ACM (2008) Blum, A., Hajiaghayi, M.T., Ligett, K., Roth, A.: Regret minimization and the price of total anarchy. In: STOC 2008: Proceedings of the 40th Annual ACM Symposium on the Theory of Computing, pp. 373–382. ACM (2008)
3.
Zurück zum Zitat Blum, A., Mansour, Y.: Learning, regret minimization, and equilibria (Chap. 4). In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.V. (eds.) Algorithmic Game Theory. Cambridge University Press, Cambridge (2007) Blum, A., Mansour, Y.: Learning, regret minimization, and equilibria (Chap. 4). In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.V. (eds.) Algorithmic Game Theory. Cambridge University Press, Cambridge (2007)
4.
Zurück zum Zitat Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends Mach. Learn. 5(1), 1–122 (2012)CrossRef Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends Mach. Learn. 5(1), 1–122 (2012)CrossRef
5.
Zurück zum Zitat Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)CrossRef Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)CrossRef
6.
Zurück zum Zitat Coucheney, P., Gaujal, B., Mertikopoulos, P.: Penalty-regulated dynamics and robust learning procedures in games. Math. Oper. Res. 40(3), 611–633 (2015)MathSciNetCrossRef Coucheney, P., Gaujal, B., Mertikopoulos, P.: Penalty-regulated dynamics and robust learning procedures in games. Math. Oper. Res. 40(3), 611–633 (2015)MathSciNetCrossRef
7.
Zurück zum Zitat Foster, D., Vohra, R.V.: Calibrated learning and correlated equilibrium. Games Econ. Behav. 21(1), 40–55 (1997)MathSciNetCrossRef Foster, D., Vohra, R.V.: Calibrated learning and correlated equilibrium. Games Econ. Behav. 21(1), 40–55 (1997)MathSciNetCrossRef
8.
Zurück zum Zitat Foster, D.J., Lykouris, T., Sridharan, K., Tardos, E.: Learning in games: robustness of fast convergence. In: Advances in Neural Information Processing Systems, pp. 4727–4735 (2016) Foster, D.J., Lykouris, T., Sridharan, K., Tardos, E.: Learning in games: robustness of fast convergence. In: Advances in Neural Information Processing Systems, pp. 4727–4735 (2016)
9.
Zurück zum Zitat Freund, Y., Schapire, R.E.: Adaptive game playing using multiplicative weights. Games Econ. Behav. 29, 79–103 (1999)MathSciNetCrossRef Freund, Y., Schapire, R.E.: Adaptive game playing using multiplicative weights. Games Econ. Behav. 29, 79–103 (1999)MathSciNetCrossRef
10.
Zurück zum Zitat Goldberg, P.W., Roth, A.: Bounds for the query complexity of approximate equilibria. ACM Trans. Econ. Comput. 4(4), 24:1–24:25 (2016)MathSciNetCrossRef Goldberg, P.W., Roth, A.: Bounds for the query complexity of approximate equilibria. ACM Trans. Econ. Comput. 4(4), 24:1–24:25 (2016)MathSciNetCrossRef
11.
Zurück zum Zitat Hall, P., Heyde, C.C.: Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press, New York (1980)MATH Hall, P., Heyde, C.C.: Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press, New York (1980)MATH
12.
Zurück zum Zitat Hannan, J.: Approximation to Bayes risk in repeated play. In: Dresher, M., Tucker, A.W., Wolfe, P. (eds.) Contributions to the Theory of Games. Annals of Mathematics Studies, vol. 39, pp. 97–139. Princeton University Press, Princeton (1957)MATH Hannan, J.: Approximation to Bayes risk in repeated play. In: Dresher, M., Tucker, A.W., Wolfe, P. (eds.) Contributions to the Theory of Games. Annals of Mathematics Studies, vol. 39, pp. 97–139. Princeton University Press, Princeton (1957)MATH
13.
Zurück zum Zitat Hart, S., Mas-Colell, A.: A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000)MathSciNetCrossRef Hart, S., Mas-Colell, A.: A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000)MathSciNetCrossRef
14.
Zurück zum Zitat Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. J. Comput. Syst. Sci. 71(3), 291–307 (2005)MathSciNetCrossRef Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. J. Comput. Syst. Sci. 71(3), 291–307 (2005)MathSciNetCrossRef
15.
Zurück zum Zitat Kleinberg, R., Piliouras, G., Tardos, É.: Load balancing without regret in the bulletin board model. Distrib. Comput. 24(1), 21–29 (2011)CrossRef Kleinberg, R., Piliouras, G., Tardos, É.: Load balancing without regret in the bulletin board model. Distrib. Comput. 24(1), 21–29 (2011)CrossRef
16.
Zurück zum Zitat Krichene, W., Drighès, B., Bayen, A.M.: Learning Nash equilibria in congestion games. arXiv preprint arXiv:1408.0017 (2014) Krichene, W., Drighès, B., Bayen, A.M.: Learning Nash equilibria in congestion games. arXiv preprint arXiv:​1408.​0017 (2014)
17.
18.
19.
20.
Zurück zum Zitat Mertikopoulos, P., Moustakas, A.L.: The emergence of rational behavior in the presence of stochastic perturbations. Ann. Appl. Probab. 20(4), 1359–1388 (2010)MathSciNetCrossRef Mertikopoulos, P., Moustakas, A.L.: The emergence of rational behavior in the presence of stochastic perturbations. Ann. Appl. Probab. 20(4), 1359–1388 (2010)MathSciNetCrossRef
21.
Zurück zum Zitat Mertikopoulos, P., Sandholm, W.H.: Learning in games via reinforcement and regularization. Math. Oper. Res. 41(4), 1297–1324 (2016)MathSciNetCrossRef Mertikopoulos, P., Sandholm, W.H.: Learning in games via reinforcement and regularization. Math. Oper. Res. 41(4), 1297–1324 (2016)MathSciNetCrossRef
23.
Zurück zum Zitat Sandholm, W.H.: Population Games and Evolutionary Dynamics. Economic Learning and Social Evolution. MIT Press, Cambridge (2010)MATH Sandholm, W.H.: Population Games and Evolutionary Dynamics. Economic Learning and Social Evolution. MIT Press, Cambridge (2010)MATH
24.
Zurück zum Zitat Syrgkanis, V., Agarwal, A., Luo, H., Schapire, R.E.: Fast convergence of regularized learning in games. In: Advances in Neural Information Processing Systems, pp. 2989–2997 (2015) Syrgkanis, V., Agarwal, A., Luo, H., Schapire, R.E.: Fast convergence of regularized learning in games. In: Advances in Neural Information Processing Systems, pp. 2989–2997 (2015)
25.
26.
Zurück zum Zitat Viossat, Y., Zapechelnyuk, A.: No-regret dynamics and fictitious play. J. Econ. Theory 148(2), 825–842 (2013)MathSciNetCrossRef Viossat, Y., Zapechelnyuk, A.: No-regret dynamics and fictitious play. J. Econ. Theory 148(2), 825–842 (2013)MathSciNetCrossRef
27.
Zurück zum Zitat Vovk, V.G.: Aggregating strategies. In: COLT 1990: Proceedings of the 3rd Workshop on Computational Learning Theory, pp. 371–383 (1990) Vovk, V.G.: Aggregating strategies. In: COLT 1990: Proceedings of the 3rd Workshop on Computational Learning Theory, pp. 371–383 (1990)
28.
Zurück zum Zitat Weibull, J.W.: Evolutionary Game Theory. MIT Press, Cambridge (1995)MATH Weibull, J.W.: Evolutionary Game Theory. MIT Press, Cambridge (1995)MATH
Metadaten
Titel
Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence
verfasst von
Johanne Cohen
Amélie Héliou
Panayotis Mertikopoulos
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-66700-3_20