Published in: Neural Processing Letters 3/2020

28.02.2020

Neural Network-Based Optimal Tracking Control of Continuous-Time Uncertain Nonlinear System via Reinforcement Learning

Author: Jingang Zhao

Abstract

This note investigates optimal tracking control for uncertain continuous-time nonlinear systems using a novel reinforcement learning (RL) scheme. The uncertainty here refers to unknown system drift dynamics. Based on the nonlinear system and the reference signal, we first formulate the tracking problem by constructing an augmented system. The optimal tracking control problem for the original nonlinear system is thus transformed into solving the Hamilton–Jacobi–Bellman (HJB) equation of the augmented system. A new online RL method based on a single neural network (NN) is proposed to learn the solution of the tracking HJB equation, while the corresponding optimal control input, which minimizes the tracking HJB equation, is computed forward in time without requiring value iteration, policy iteration, or knowledge of the system drift dynamics. To relax the dependence of the RL method on the traditional persistence of excitation (PE) condition, a concurrent learning technique is adopted to design the NN tuning laws. The uniform ultimate boundedness (UUB) of the NN weight errors and the closed-loop augmented system states is rigorously proved. Three numerical simulation examples demonstrate the effectiveness of the proposed scheme.
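
The augmented-system construction mentioned in the abstract can be illustrated with a formulation that is common in the RL tracking literature. The following is only a sketch under stated assumptions, not the paper's own derivation: it assumes control-affine dynamics, a reference generator, a discounted quadratic cost, and a linearly parameterized critic. All symbols (f, g, h_d, Q, R, \gamma, W, \varphi, \varepsilon) are illustrative choices, and the paper's notation and cost function may differ.

```latex
% Assumed plant and reference generator (f is the unknown drift dynamics)
\dot{x} = f(x) + g(x)\,u, \qquad \dot{x}_d = h_d(x_d), \qquad e = x - x_d

% Augmented state and augmented dynamics
X = \begin{bmatrix} e \\ x_d \end{bmatrix}, \qquad
\dot{X} = F(X) + G(X)\,u, \qquad
F(X) = \begin{bmatrix} f(e + x_d) - h_d(x_d) \\ h_d(x_d) \end{bmatrix}, \quad
G(X) = \begin{bmatrix} g(e + x_d) \\ 0 \end{bmatrix}

% Assumed discounted quadratic cost with Q_T = diag(Q, 0), discount \gamma > 0
V(X(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)}
  \left[ X^{\top} Q_T X + u^{\top} R\, u \right] d\tau

% Tracking HJB equation and the control obtained from its stationarity condition
0 = X^{\top} Q_T X + u^{*\top} R\, u^{*} - \gamma V^{*}(X)
    + \nabla V^{*}(X)^{\top} \left( F(X) + G(X)\,u^{*} \right), \qquad
u^{*} = -\tfrac{1}{2} R^{-1} G(X)^{\top} \nabla V^{*}(X)

% Single-critic approximation of the value function
V^{*}(X) = W^{\top} \varphi(X) + \varepsilon(X)
```

In this form, only the critic weights W need to be learned online, since the control follows in closed form from the gradient of the approximated value function. How the paper avoids explicit use of the drift term F(X) in the tuning law is not recoverable from the abstract and is left unspecified here.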


Metadata
Title
Neural Network-Based Optimal Tracking Control of Continuous-Time Uncertain Nonlinear System via Reinforcement Learning
Author
Jingang Zhao
Publication date
28.02.2020
Publisher
Springer US
Published in
Neural Processing Letters / Issue 3/2020
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-020-10220-z
