Published in: Optical Memory and Neural Networks 1/2024

01-03-2024

Lateral Motion Control of a Maneuverable Aircraft Using Reinforcement Learning

Authors: Yu. V. Tiumentsev, R. A. Zarubin

Abstract

Machine learning is currently one of the most actively developing research areas, and considerable attention within it is paid to problems related to dynamical systems. One of the domains in which the application of machine learning technologies is being actively explored is aircraft of various types and purposes. This interest is driven by the complexity and variety of the tasks assigned to aircraft, further complicated by incomplete and inaccurate knowledge of the properties of the controlled object and of the conditions in which it operates. In particular, a variety of abnormal situations may occur during flight, such as equipment failures and structural damage, which must be counteracted by reconfiguring the aircraft's control system and control surfaces. The control system must therefore remain effective under these conditions by promptly changing the parameters and/or structure of the control laws it uses, and adaptive control methods make it possible to satisfy this requirement. One widely used way to synthesize control laws for dynamic systems is the LQR (Linear Quadratic Regulator) approach. A significant limitation of this approach is the lack of adaptability of the resulting control law, which prevents its use when knowledge of the control object and of its operating environment is incomplete and inaccurate. To overcome this limitation, it is proposed to modify the standard LQR on the basis of approximate dynamic programming, a special case of which is the adaptive critic design (ACD) method. For the resulting ACD-LQR combination, the problem of controlling the lateral motion of a maneuvering aircraft is solved. The results obtained demonstrate the promise of this approach for controlling aircraft motion under uncertainty.
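
To make the LQR baseline referred to in the abstract concrete, the sketch below computes a discrete-time LQR gain by fixed-point iteration of the Riccati equation and applies it as state feedback. This is a minimal illustration only: the system matrices are generic placeholders, not the lateral-motion model of the aircraft studied in the paper, and the closing comment merely indicates where an adaptive-critic (ACD) variant would learn the value function online instead of relying on a known model.

```python
import numpy as np

# Illustrative placeholder model (NOT the aircraft lateral dynamics from the paper).
A = np.array([[0.99, 0.10],
              [0.00, 0.95]])   # state transition matrix
B = np.array([[0.00],
              [0.10]])         # control input matrix
Q = np.eye(2)                  # state cost weights
R = np.array([[1.0]])          # control cost weight

# Solve the discrete algebraic Riccati equation by fixed-point iteration.
P = Q.copy()
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # gain implied by current P
    P_next = Q + A.T @ P @ (A - B @ K)                  # Riccati update
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)       # optimal feedback gain
print("LQR gain K:", K)

# Closed-loop control u = -K x. An adaptive-critic (ACD) variant, as discussed
# in the abstract, would instead learn the value function (and hence the gain)
# from observed transitions, so that the controller can adapt when the true
# dynamics drift away from the nominal A and B.
x = np.array([[1.0], [0.0]])   # example initial state
for _ in range(50):
    u = -K @ x
    x = A @ x + B @ u
```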

Metadata
Title
Lateral Motion Control of a Maneuverable Aircraft Using Reinforcement Learning
Authors
Yu. V. Tiumentsev
R. A. Zarubin
Publication date
01-03-2024
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Issue 1/2024
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X2401003X
