Published in: Neural Computing and Applications 2/2013

1 February 2013 | ISNN 2011

A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints

Authors: Ding Wang, Derong Liu, Dongbin Zhao, Yuzhu Huang, Dehua Zhang

Abstract

This paper proposes a novel neural-network-based iterative adaptive dynamic programming (ADP) algorithm for solving the optimal control problem of a class of nonlinear discrete-time systems with control constraints. By introducing a generalized nonquadratic functional, the iterative ADP algorithm is developed using the globalized dual heuristic programming (GDHP) technique to design the optimal controller, together with a convergence analysis. Three neural networks are constructed as parametric structures to facilitate the implementation of the iterative algorithm. At each iteration they approximate the cost function, the optimal control law, and the controlled nonlinear discrete-time system, respectively. A simulation example is provided to verify the effectiveness of the control scheme in solving the constrained optimal control problem.
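
The abstract does not state the generalized nonquadratic functional explicitly. As an illustration only, the sketch below assumes the form commonly used in the constrained-control literature for a scalar control with bound |u| < u_bar and weight r, namely W(u) = 2 r u_bar ∫_0^u atanh(s / u_bar) ds. The function names, the scalar setting, and the choice of atanh are assumptions made for this example and may differ from the paper's definition. The closed form is compared against a simple trapezoidal integration.

```python
import math


def nonquadratic_cost(u: float, u_bar: float, r: float = 1.0) -> float:
    """Closed form of W(u) = 2*r*u_bar * integral_0^u atanh(s/u_bar) ds.

    Assumed illustrative form (scalar control, bound |u| < u_bar);
    not taken from the paper itself.
    """
    if abs(u) >= u_bar:
        raise ValueError("control must satisfy |u| < u_bar")
    x = u / u_bar
    # integral of atanh(s/u_bar) ds = s*atanh(s/u_bar) + (u_bar/2)*ln(1 - (s/u_bar)^2)
    return 2.0 * r * u_bar * u * math.atanh(x) + r * u_bar**2 * math.log1p(-x * x)


def nonquadratic_cost_numeric(u: float, u_bar: float, r: float = 1.0,
                              steps: int = 2000) -> float:
    """Trapezoidal approximation of the same integral, used as a cross-check."""
    def integrand(s: float) -> float:
        return 2.0 * r * u_bar * math.atanh(s / u_bar)

    h = u / steps  # negative when u < 0, which integrates from 0 down to u
    total = 0.5 * (integrand(0.0) + integrand(u))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


if __name__ == "__main__":
    for u in (-0.9, -0.5, 0.0, 0.01, 0.5, 0.9):
        print(u, nonquadratic_cost(u, 1.0), nonquadratic_cost_numeric(u, 1.0))
```

Under this assumed form the penalty is zero at u = 0, symmetric in u, close to the quadratic r u^2 for small controls, and grows steeply as |u| approaches u_bar, which is how a functional of this kind discourages controls near the constraint boundary.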

Metadata
Title
A neural-network-based iterative GDHP approach for solving a class of nonlinear optimal control problems with control constraints
Authors
Ding Wang
Derong Liu
Dongbin Zhao
Yuzhu Huang
Dehua Zhang
Publication date
1 February 2013
Publisher
Springer-Verlag
Published in
Neural Computing and Applications / Issue 2/2013
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-011-0707-2