Skip to main content
Erschienen in: Dynamic Games and Applications 1/2016

01.03.2016

A General Internal Regret-Free Strategy

verfasst von: Ehud Lehrer, Eilon Solan

Erschienen in: Dynamic Games and Applications | Ausgabe 1/2016

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

We study sequential decision problems where the decision maker does not observe the states of nature, but rather receives a noisy signal, whose distribution depends on the current state and on the action that she plays. We do not assume that the decision maker considers the worst-case scenario, but rather has a response correspondence, which maps distributions over signals to subjective best responses. We extend the concept of internal regret-free strategy to this setup and provide an algorithm that generates such a strategy.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
For a finite set \(Z\), the set of probability distributions over \(Z\) is denoted by \(\Delta (Z)\).
 
2
Unless indicated otherwise, we use the supremum norm: for every two vectors \(x,x' \in {\mathbb {R}}^d\), the distance between \(x\) and \(x'\) is \(d(x,x') = \max \{ |x_i-x'_i|, 1 \le i \le d\}\), and for every subset \(D \subseteq {\mathbb {R}}^d\), the distance between \(x\) and \(D\) is \(d(x,D) := \inf _{x' \in D} d(x,x')\). Likewise, the distance between two subsets \(Y_1\) and \(Y_2\) of \(Y\) is given by \(d(Y_1,Y_2) := \max _{y_1 \in Y_1} \min _{y_2 \in Y_2}d(y_1,y_2)\).
 
3
We say that event \(B\) holds on event \(A\) if \({\mathbb {P}}(A \cap B) = {\mathbb {P}}(A)\). Thus, a certain inequality holds on event \(A\) if the probability of all points in \(A\) that do not satisfy it is 0.
 
4
For convenience, in the examples we allow payoffs to be greater than \(1\).
 
5
The compactness of \(Y(\mu )\) and the concavity of the entropy function ensure that \(y^\mathrm{ENT}_{\mu }\) is well defined.
 
6
Because the range of \(\sigma \) is finite, we can omit the dependency of \(T'\) on \(x\).
 
7
[19] refer to Blackwell’s invited address to the Institute of Mathematical Statistics, Seattle, August 1956, entitled “Controlled random walks”.
 
8
In most applications of Blackwell’s theory in game theory, the payoffs are uniquely determined by the actions of the players. In our proof, as in Blackwell’s original paper, the payoffs are random variables.
 
9
For any two vectors \(a,b\in {\mathbb {R}}^d\), we denote by \(a \cdot b\) the coordinate-wise product: \((a\cdot b)_k:=a_kb_k\) for every \(k\). Similarly, \(\frac{a}{b} \) denotes the coordinate-wise quotient of \(a\) and \(b\): \((\frac{a}{b})_k := \frac{a_k}{b_k}\) for every \(k\).
 
Literatur
1.
Zurück zum Zitat Aumann RJ, Maschler M (1995) Repeated games with incomplete information. MIT Press, CambridgeMATH Aumann RJ, Maschler M (1995) Repeated games with incomplete information. MIT Press, CambridgeMATH
2.
Zurück zum Zitat Bartók G, Foster D, Pál D, Rakhlin A, Szepesvári C (2013) Partial monitoring-classification, regret bounds, and algorithms, preprint Bartók G, Foster D, Pál D, Rakhlin A, Szepesvári C (2013) Partial monitoring-classification, regret bounds, and algorithms, preprint
4.
5.
Zurück zum Zitat Cesa-Bianchi N, Lugosi G (2006) Prediction, learning, and games. Cambridge University Press, CambridgeCrossRefMATH Cesa-Bianchi N, Lugosi G (2006) Prediction, learning, and games. Cambridge University Press, CambridgeCrossRefMATH
7.
Zurück zum Zitat Foster DP (1999) A proof of calibration via Blackwell’s approachability theorem. Games Econ Behav 29:73–78CrossRefMATH Foster DP (1999) A proof of calibration via Blackwell’s approachability theorem. Games Econ Behav 29:73–78CrossRefMATH
12.
Zurück zum Zitat Hannan J (1957) Approximation to Bayes risk in repeated play. Contrib Theory Games 3:97–139MATH Hannan J (1957) Approximation to Bayes risk in repeated play. Contrib Theory Games 3:97–139MATH
13.
15.
Zurück zum Zitat Hazan E, Kakade SM (2012) (weak) Calibration is computationally hard. In: Conference on learning theory (COLT) 2012 Hazan E, Kakade SM (2012) (weak) Calibration is computationally hard. In: Conference on learning theory (COLT) 2012
16.
Zurück zum Zitat Lehrer E (2002) Approachability in infinitely dimensional spaces. Int J Game Theory 31:255–270MathSciNetMATH Lehrer E (2002) Approachability in infinitely dimensional spaces. Int J Game Theory 31:255–270MathSciNetMATH
17.
Zurück zum Zitat Lehrer E (2012) Partially specified probabilities: decisions and games. Am Econ J Microecon 4:70–100CrossRef Lehrer E (2012) Partially specified probabilities: decisions and games. Am Econ J Microecon 4:70–100CrossRef
18.
Zurück zum Zitat Lehrer E, Solan E (2007) Learning to play partially specified equilibrium. Mimeo Lehrer E, Solan E (2007) Learning to play partially specified equilibrium. Mimeo
19.
Zurück zum Zitat Luce DR, Raiffa H (1958) Games and decisions. Wiley, NYMATH Luce DR, Raiffa H (1958) Games and decisions. Wiley, NYMATH
20.
21.
Zurück zum Zitat Mannor S, Shimkin N (2003) On-line learning with imperfect monitoring. In: Proceedings of the 16th annual conference on learning theory. Springer, Berlin, pp 552–567 Mannor S, Shimkin N (2003) On-line learning with imperfect monitoring. In: Proceedings of the 16th annual conference on learning theory. Springer, Berlin, pp 552–567
22.
Zurück zum Zitat Perchet V (2009) Calibration and internal no-regret with random signals. In: ALT2009, pp 68–82 Perchet V (2009) Calibration and internal no-regret with random signals. In: ALT2009, pp 68–82
23.
Zurück zum Zitat Perchet V (2011) Internal regret with partial monitoring calibration-based optimal algorithms. J Mach Learn Res 12:1893–1921MathSciNetMATH Perchet V (2011) Internal regret with partial monitoring calibration-based optimal algorithms. J Mach Learn Res 12:1893–1921MathSciNetMATH
25.
Zurück zum Zitat Piccolboni A, Schindelhauer C (2001) Discrete prediction games with arbitrary feedback and loss. In: COLT 2001, annual conference on computational learning theory #14, lecture notes in computer science, 2111, pp 208–223 Piccolboni A, Schindelhauer C (2001) Discrete prediction games with arbitrary feedback and loss. In: COLT 2001, annual conference on computational learning theory #14, lecture notes in computer science, 2111, pp 208–223
26.
Zurück zum Zitat Rockafeller RT, Wets RJB (2009) Variational analysis. Springer, Berlin Rockafeller RT, Wets RJB (2009) Variational analysis. Springer, Berlin
28.
Zurück zum Zitat Stoltz G, Lugosi G (2005) Internal regret in on-line portfolio selection. Mach Learn 59:125–159CrossRefMATH Stoltz G, Lugosi G (2005) Internal regret in on-line portfolio selection. Mach Learn 59:125–159CrossRefMATH
Metadaten
Titel
A General Internal Regret-Free Strategy
verfasst von
Ehud Lehrer
Eilon Solan
Publikationsdatum
01.03.2016
Verlag
Springer US
Erschienen in
Dynamic Games and Applications / Ausgabe 1/2016
Print ISSN: 2153-0785
Elektronische ISSN: 2153-0793
DOI
https://doi.org/10.1007/s13235-015-0143-5

Weitere Artikel der Ausgabe 1/2016

Dynamic Games and Applications 1/2016 Zur Ausgabe

Premium Partner