
Neural Processing Letters

Published: 19.10.2021

Model-Free Optimal Consensus Control for Multi-agent Systems Based on DHP Algorithm

Authors: Haoen Shi, Yanghe Feng, Chaoxu Mu, Yunkai Wu

Published in: Neural Processing Letters | Issue 1/2022


Abstract

This paper develops a novel model-free dual heuristic dynamic programming (DHP) algorithm, combined with policy iteration and least-squares techniques, to implement optimal consensus control of discrete-time multi-agent systems. Achieving optimal consensus control requires solving the coupled Hamilton-Jacobi-Bellman (HJB) equations, which is generally difficult, especially when the mathematical model is unknown. To overcome these difficulties, the DHP method is carried out by reinforcement learning using data collected online rather than the accurate system dynamics. First, the performance index and the corresponding Bellman equation are derived. Each agent’s value function has a quadratic form. Then, a model network is employed to approximate the accurate system dynamics. The Q-function Bellman equation is obtained next. By taking the derivative of the Q-function, the DHP method is applied to construct the update formula. Convergence and stability analyses of the proposed algorithm are presented. Two simulation examples are provided to illustrate the validity of the proposed algorithm.
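To make the "quadratic Q-function plus least squares plus policy iteration" idea concrete, the following is a minimal sketch for a much simpler setting than the paper's: a single agent with linear discrete-time dynamics and a quadratic stage cost. It is not the authors' DHP algorithm; it omits the derivative (costate) step, the model network, and the coupling between agents. All names (`quad_features`, `theta_to_H`, `lspi_lqr`) and the example matrices are hypothetical. The matrices `A` and `B` are used only as a stand-in plant to generate transition data; the learner itself fits the Q-function from the sampled data alone. The initial gain `K0` is assumed to be stabilizing.

```python
import numpy as np

def quad_features(z):
    """Monomials z_i*z_j (i <= j) scaled so that features @ theta == z' H z."""
    n = len(z)
    return np.array([z[i] * z[j] * (1.0 if i == j else 2.0)
                     for i in range(n) for j in range(i, n)])

def theta_to_H(theta, n):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((n, n))
    idx = 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    return H

def lspi_lqr(A, B, Qc, Rc, K0, iters=10, samples=200, seed=0):
    """Least-squares policy iteration on Q(x, u) = [x; u]' H [x; u].

    Assumption: K0 is stabilizing, so the Q-function of each policy is finite.
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = K0.copy()
    for _ in range(iters):
        Phi, y = [], []
        for _ in range(samples):
            x = rng.normal(size=n)
            u = -K @ x + rng.normal(size=m)   # exploration noise on the action
            x_next = A @ x + B @ u            # data from the stand-in plant
            u_next = -K @ x_next              # next action under current policy
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, u_next])
            # Bellman equation: Q(z) - Q(z_next) = stage cost
            Phi.append(quad_features(z) - quad_features(z_next))
            y.append(x @ Qc @ x + u @ Rc @ u)
        # Policy evaluation: least-squares fit of the Q-function parameters
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = theta_to_H(theta, n + m)
        # Policy improvement: minimize Q over u, giving u = -H_uu^{-1} H_ux x
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

if __name__ == "__main__":
    A = np.array([[0.9, 0.1], [0.0, 0.8]])   # hypothetical stable plant
    B = np.array([[0.0], [0.1]])
    K, H = lspi_lqr(A, B, np.eye(2), np.eye(1), np.zeros((1, 2)))
    print("learned feedback gain:", K)
```

In this noise-free linear-quadratic case, the learned gain should approach the gain obtained from the discrete-time Riccati equation, which gives a simple sanity check. The multi-agent setting in the paper replaces the single Bellman equation with coupled equations, one per agent, and updates the gradient of the Q-function rather than the Q-function itself.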



Title: Model-Free Optimal Consensus Control for Multi-agent Systems Based on DHP Algorithm
Authors: Haoen Shi, Yanghe Feng, Chaoxu Mu, Yunkai Wu
Publication date: 19.10.2021
Publisher: Springer US
Published in: Neural Processing Letters | Issue 1/2022
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI: https://doi.org/10.1007/s11063-021-10641-4
