2014 | OriginalPaper | Chapter

6. Graphical Games: Distributed Multiplayer Games on Graphs

Authors: Frank L. Lewis, Hongwei Zhang, Kristian Hengster-Movric, Abhijit Das

Published in: Cooperative Control of Multi-Agent Systems

Publisher: Springer London

Abstract

In this chapter, we show that distributed control protocols that both guarantee synchronization and are globally optimal for the multi-agent team always exist on any sufficiently connected communication graph, provided a different definition of optimality is used. To this end, we study the notion of Nash equilibrium for multiplayer games on graphs. This leads to a new kind of differential game, the graphical game. In graphical games, each agent has its own dynamics as well as its own local performance index. The dynamics and local performance indices of each agent are distributed: they depend on the state of the agent, the control of the agent, and the controls of the agent's neighbors. We show how to compute distributed control protocols that guarantee global Nash equilibrium for multi-agent teams on any graph that has a spanning tree.
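
The spanning-tree condition mentioned in the abstract can be checked directly on a given communication graph. The following is a minimal sketch and not the chapter's own procedure: the function names `reachable_from` and `has_spanning_tree` are hypothetical, and the edge convention (a pair `(src, dst)` means information flows from `src` to `dst`) is an assumption made for illustration. Under that convention, a directed graph has a spanning tree when at least one root node can reach every other node.

```python
from collections import deque


def reachable_from(root, adjacency):
    """Return the set of nodes reachable from root via breadth-first search."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def has_spanning_tree(nodes, edges):
    """True if some root node reaches every node along the directed edges.

    Assumed convention (not from the source): (src, dst) means that
    information flows from src to dst.
    """
    nodes = list(nodes)
    adjacency = {n: [] for n in nodes}
    for src, dst in edges:
        adjacency[src].append(dst)
    return any(len(reachable_from(r, adjacency)) == len(nodes) for r in nodes)


# A chain 0 -> 1 -> 2 has a spanning tree rooted at node 0.
chain_ok = has_spanning_tree([0, 1, 2], [(0, 1), (1, 2)])
# Two independent sources feeding node 2: no single root reaches all nodes.
two_sources_ok = has_spanning_tree([0, 1, 2], [(0, 2), (1, 2)])
```

Trying every node as a root costs one graph traversal per node, which is adequate for the small teams typical of such examples; a strongly-connected-components pass would scale better on large graphs.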

Metadata

Title: Graphical Games: Distributed Multiplayer Games on Graphs
Authors: Frank L. Lewis, Hongwei Zhang, Kristian Hengster-Movric, Abhijit Das
Copyright Year: 2014
Publisher: Springer London
DOI: https://doi.org/10.1007/978-1-4471-5574-4_6