Published in: Optical Memory and Neural Networks 1/2024

01.03.2024

Lateral Motion Control of a Maneuverable Aircraft Using Reinforcement Learning

Authors: Yu. V. Tiumentsev, R. A. Zarubin


Abstract

Machine learning is currently one of the most actively developing research areas, and considerable attention in ongoing research is paid to problems involving dynamical systems. One domain in which the application of machine learning technologies is being actively explored is aircraft of various types and purposes. This interest stems from the complexity and variety of the tasks assigned to aircraft, compounded by incomplete and inaccurate knowledge of the properties of the object under study and of the conditions in which it operates. In particular, a variety of abnormal situations may occur during flight, such as equipment failures and structural damage, which must be counteracted by reconfiguring the aircraft’s control system and controls. The aircraft control system must be able to operate effectively under these conditions by promptly changing the parameters and/or structure of the control laws in use; adaptive control methods make it possible to satisfy this requirement. One widely used way to synthesize control laws for dynamic systems is the linear quadratic regulator (LQR) approach. A significant limitation of this approach is the lack of adaptability of the resulting control law, which prevents its use when knowledge of the properties of the control object and of the environment in which it operates is incomplete and inaccurate. To overcome this limitation, it was proposed to modify the standard variant of the LQR based on approximate dynamic programming, a special case of which is the adaptive critic design (ACD) method. For the combined ACD–LQR scheme, the problem of controlling the lateral motion of a maneuvering aircraft is solved. The results obtained demonstrate the promise of this approach to controlling airplane motion under uncertainty.
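For context, the standard (non-adaptive) LQR design that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration only: the system matrices below are hypothetical values chosen for the example, not the lateral-motion aircraft model used in the paper, and the gain is computed offline by iterating the discrete-time Riccati equation, which is precisely the step that lacks adaptability.

```python
import numpy as np

# Hypothetical discrete-time linear system x[k+1] = A x[k] + B u[k]
# (illustrative values, not the aircraft model from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 0.9]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)          # state cost weight
R = np.array([[1.0]])  # control cost weight

# Solve the discrete algebraic Riccati equation by fixed-point iteration.
P = np.eye(2)
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ K

# Optimal state-feedback gain; the control law is u = -K x.
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

Because K is fixed once computed from the assumed (A, B), any mismatch between these matrices and the real plant degrades the controller; the ACD modification discussed in the paper addresses this by learning the cost (critic) online instead of solving the Riccati equation with a fixed model.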


Metadata
Title
Lateral Motion Control of a Maneuverable Aircraft Using Reinforcement Learning
Authors
Yu. V. Tiumentsev
R. A. Zarubin
Publication date
01.03.2024
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Issue 1/2024
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X2401003X
