2018 | Original Paper | Book Chapter

5. Differential Graphical Games

Authors: Rushikesh Kamalapurkar, Patrick Walters, Joel Rosenfeld, Warren Dixon

Published in: Reinforcement Learning for Optimal Feedback Control

Publisher: Springer International Publishing

Abstract

This chapter deals with the formulation and online approximate feedback-Nash equilibrium solution of an optimal network formation tracking problem. A relative control error minimization technique is introduced to facilitate the formulation of a feasible infinite-horizon total-cost differential graphical game. A dynamic programming-based feedback-Nash equilibrium solution to the differential graphical game is obtained via the development of a set of coupled Hamilton–Jacobi equations. The developed approximate feedback-Nash equilibrium solution is analyzed using a Lyapunov-based stability analysis to demonstrate ultimately bounded formation tracking in the presence of uncertainties. In addition to control, this chapter also explores applications of differential graphical games to monitoring the behavior of neighboring agents in a network.
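
For orientation, the coupled Hamilton–Jacobi equations mentioned in the abstract can be sketched for a generic N-player nonzero-sum differential game. This is a textbook-style illustration under assumed notation, not the chapter's own graphical-game formulation, which the abstract describes in terms of formation tracking and relative control errors. The symbols $f$, $g_j$, $Q_i$, $R_{ij}$, and $V_i^{*}$ are assumptions introduced here: control-affine dynamics, a state penalty, control weights with $R_{ii}$ symmetric positive definite, and the value function of player $i$.

```latex
% Assumed generic setting (illustrative notation, not taken from the chapter)
\dot{x} = f(x) + \sum_{j=1}^{N} g_j(x)\,u_j, \qquad
J_i = \int_{0}^{\infty} \Big( Q_i(x) + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Big)\, dt .

% Feedback-Nash policies obtained by minimizing each player's Hamiltonian over u_i
u_i^{*}(x) = -\tfrac{1}{2}\, R_{ii}^{-1}\, g_i^{\top}(x)\, \big(\nabla_x V_i^{*}(x)\big)^{\top},
\qquad i = 1,\dots,N .

% Coupled Hamilton--Jacobi equations, one per player
0 = Q_i(x) + \sum_{j=1}^{N} u_j^{*\top}(x)\, R_{ij}\, u_j^{*}(x)
  + \nabla_x V_i^{*}(x) \Big( f(x) + \sum_{j=1}^{N} g_j(x)\, u_j^{*}(x) \Big),
\qquad V_i^{*}(0) = 0 .
```

The equations are coupled because every player's optimal policy appears in every other player's equation, so in general the $N$ value functions have to be found together.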

Footnotes
1. Parts of the text in this section are reproduced, with permission, from [1], ©2016, IEEE.
2. Parts of the text in this section are reproduced, with permission, from [8], ©2015, IEEE.

References
1. Kamalapurkar R, Klotz JR, Walters P, Dixon WE (2018) Model-based reinforcement learning for differential graphical games. IEEE Trans Control Netw Syst 5:423–433
4.
5. Friedman A (1971) Differential games. Wiley
6. Bressan A, Priuli FS (2006) Infinite horizon noncooperative differential games. J Differ Equ 227(1):230–257
7. Bressan A (2011) Noncooperative differential games. Milan J Math 79(2):357–427
8. Klotz J, Andrews L, Kamalapurkar R, Dixon WE (2015) Decentralized monitoring of leader-follower networks of uncertain nonlinear systems. In: Proceedings of the American control conference, pp 1393–1398
9. Khoo S, Xie L (2009) Robust finite-time consensus tracking algorithm for multirobot systems. IEEE/ASME Trans Mechatron 14(2):219–228
10. Liberzon D (2012) Calculus of variations and optimal control theory: a concise introduction. Princeton University Press
11. Kamalapurkar R, Andrews L, Walters P, Dixon WE (2017) Model-based reinforcement learning for infinite-horizon approximate optimal tracking. IEEE Trans Neural Netw Learn Syst 28(3):753–758
12. Chowdhary GV, Johnson EN (2011) Theory and flight-test validation of a concurrent-learning adaptive controller. J Guid Control Dyn 34(2):592–607
13. Kamalapurkar R, Walters P, Dixon WE (2016) Model-based reinforcement learning for approximate optimal regulation. Automatica 64:94–104
14. Bell Z, Parikh A, Nezvadovitz J, Dixon WE (2016) Adaptive control of a surface marine craft with parameter identification using integral concurrent learning. In: Proceedings of the IEEE conference on decision and control, pp 389–394
15. Vamvoudakis KG, Lewis FL (2011) Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 47:1556–1569
16. Johnson M, Bhasin S, Dixon WE (2011) Nonlinear two-player zero-sum game approximate solution using a policy iteration algorithm. In: Proceedings of the IEEE conference on decision and control, pp 142–147
17. Vamvoudakis KG, Lewis FL, Hudas GR (2012) Multi-agent differential graphical games: online adaptive learning solution for synchronization with optimality. Automatica 48(8):1598–1611
18. Kamalapurkar R, Dinh HT, Walters P, Dixon WE (2013) Approximate optimal cooperative decentralized control for consensus in a topological network of agents with uncertain nonlinear dynamics. In: Proceedings of the American control conference, Washington, DC, pp 1322–1327
19. Modares H, Lewis FL, Naghibi-Sistani MB (2014) Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems. Automatica 50(1):193–202
20. Kamalapurkar R, Dinh H, Bhasin S, Dixon WE (2015) Approximate optimal trajectory tracking for continuous-time nonlinear systems. Automatica 51:40–48
21. Khalil HK (2002) Nonlinear Systems, 3rd edn. Prentice Hall, Upper Saddle River, NJ
22. Chowdhary G, Yucelen T, Mühlegg M, Johnson EN (2013) Concurrent learning adaptive control of linear systems with exponentially convergent bounds. Int J Adapt Control Signal Process 27(4):280–301
23. Savitzky A, Golay MJE (1964) Smoothing and differentiation of data by simplified least squares procedures. Anal Chem 36(8):1627–1639
24. Krstic M, Li ZH (1998) Inverse optimal design of input-to-state stabilizing nonlinear controllers. IEEE Trans Autom Control 43(3):336–350
25. Mombaur K, Truong A, Laumond JP (2010) From human to humanoid locomotion - an inverse optimal control approach. Auton Robots 28(3):369–383
26. Ratliff ND, Bagnell JA, Zinkevich MA (2006) Maximum margin planning. In: Proceedings of the international conference on learning
27. Pang Z, Liu G (2012) Design and implementation of secure networked predictive control systems under deception attacks. IEEE Trans Control Syst Technol 20(5):1334–1342
28. Clark A, Zhu Q, Poovendran R, Başar T (2013) An impact-aware defense against Stuxnet. In: Proceedings of the American control conference, pp 4146–4153
29. Kamalapurkar R, Walters P, Dixon WE (2013) Concurrent learning-based approximate optimal regulation. In: Proceedings of the IEEE conference on decision and control, Florence, IT, pp 6256–6261
30. Bhasin S, Kamalapurkar R, Johnson M, Vamvoudakis KG, Lewis FL, Dixon WE (2013) A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 49(1):89–92
31. Vamvoudakis KG, Lewis FL (2009) Online synchronous policy iteration method for optimal control. In: Yu W (ed) Recent advances in intelligent control systems, Springer, pp 357–374
32. Ioannou P, Sun J (1996) Robust adaptive control. Prentice Hall
33. Johnson M, Hiramatsu T, Fitz-Coy N, Dixon WE (2010) Asymptotic Stackelberg optimal control design for an uncertain Euler-Lagrange system. In: Proceedings of the IEEE conference on decision and control, Atlanta, GA, pp 6686–6691
34. Vamvoudakis KG, Lewis FL (2010) Online neural network solution of nonlinear two-player zero-sum games using synchronous policy iteration. In: Proceedings of the IEEE conference on decision and control
35. Vrabie D, Lewis FL (2010) Integral reinforcement learning for online computation of feedback Nash strategies of nonzero-sum differential games. In: Proceedings of the IEEE conference on decision and control, pp 3066–3071
36. Lewis M, Tan K (1997) High precision formation control of mobile robots using virtual structures. Auton Robots 4(4):387–403
37. Balch T, Arkin R (1998) Behavior-based formation control for multirobot teams. IEEE Trans Robot Autom 14(6):926–939
38. Das A, Fierro R, Kumar V, Ostrowski J, Spletzer J, Taylor C (2002) A vision-based formation control framework. IEEE Trans Robot Autom 18(5):813–825
39. Fax J, Murray R (2004) Information flow and cooperative control of vehicle formations. IEEE Trans Autom Control 49(9):1465–1476
40. Murray R (2007) Recent research in cooperative control of multivehicle systems. J Dyn Syst Meas Control 129:571–583
41. Wang J, Xin M (2010) Multi-agent consensus algorithm with obstacle avoidance via optimal control approach. Int J Control 83(12):2606–2621
42. Wang J, Xin M (2012) Distributed optimal cooperative tracking control of multiple autonomous robots. Robot Auton Syst 60(4):572–583
43. Wang J, Xin M (2013) Integrated optimal formation control of multiple unmanned aerial vehicles. IEEE Trans Control Syst Technol 21(5):1731–1744
44. Lin W (2014) Distributed UAV formation control using differential game approach. Aerosp Sci Technol 35:54–62
45. Semsar-Kazerooni E, Khorasani K (2008) Optimal consensus algorithms for cooperative team of agents subject to partial information. Automatica 44(11):2766–2777
46. Shim DH, Kim HJ, Sastry S (2003) Decentralized nonlinear model predictive control of multiple flying robots. Proceedings of the IEEE conference on decision and control 4:3621–3626
47. Magni L, Scattolini R (2006) Stabilizing decentralized model predictive control of nonlinear systems. Automatica 42(7):1231–1236
48. Heydari A, Balakrishnan SN (2012) An optimal tracking approach to formation control of nonlinear multi-agent systems. In: Proceedings of AIAA guidance, navigation and control conference
49. Zhang H, Zhang J, Yang GH, Luo Y (2015) Leader-based optimal coordination control for the consensus problem of multiagent differential games via fuzzy adaptive dynamic programming. IEEE Trans Fuzzy Syst 23(1):152–163
50. Sundaram S, Revzen S, Pappas G (2012) A control-theoretic approach to disseminating values and overcoming malicious links in wireless networks. Automatica 48(11):2894–2901
51. Abbas W, Egerstedt M (2012) Securing multiagent systems against a sequence of intruder attacks. In: Proceedings of the American control conference, pp 4161–4166
52. Fawzi H, Tabuada P, Diggavi S (2012) Security for control systems under sensor and actuator attacks. In: Proceedings of the IEEE conference on decision and control, pp 3412–3417
53. Jung D, Selmic RR (2008) Power leader fault detection in nonlinear leader-follower networks. In: Proceedings of the IEEE conference on decision and control, pp 404–409
54. Zhang X (2010) Decentralized fault detection for a class of large-scale nonlinear uncertain systems. In: Proceedings of the American control conference, pp 5650–5655
55. Li X, Zhou K (2009) A time domain approach to robust fault detection of linear time-varying systems. Automatica 45(1):94–102
56. Potula K, Selmic RR, Polycarpou MM (2010) Dynamic leader-followers network model of human emotions and their fault detection. In: Proceedings of the IEEE conference on decision and control, pp 744–749
57. Ferdowsi H, Raja DL, Jagannathan S (2012) A decentralized fault prognosis scheme for nonlinear interconnected discrete-time systems. In: Proceedings of the American control conference, pp 5900–5905
58. Luo X, Dong M, Huang Y (2006) On distributed fault-tolerant detection in wireless sensor networks. IEEE Trans Comput 55(1):58–70
59. Fernández-Bes J, Cid-Sueiro J (2012) Decentralized detection with energy-aware greedy selective sensors. In: International workshop on cognitive information process, pp 1–6
60. Jiao Q, Modares H, Xu S, Lewis FL, Vamvoudakis KG (2016) Multi-agent zero-sum differential graphical games for disturbance rejection in distributed control. Automatica 69:24–34
61. Li J, Modares H, Chai T, Lewis FL, Xie L (2017) Off-policy reinforcement learning for synchronization in multiagent graphical games. IEEE Trans Neural Netw Learn Syst 28(10):2434–2445
62. Vamvoudakis KG (2017) Q-learning for continuous-time graphical games on large networks with completely unknown linear system dynamics. Int J Robust Nonlinear Control 27(16):2900–2920
63. Vamvoudakis KG, Modares H, Kiumarsi B, Lewis FL (2017) Game theory-based control system algorithms with real-time reinforcement learning: How to solve multiplayer games online. IEEE Control Syst 37(1):33–52

Metadata
Title
Differential Graphical Games
Authors
Rushikesh Kamalapurkar
Patrick Walters
Joel Rosenfeld
Warren Dixon
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-78384-0_5