Neural Computing and Applications

26-04-2024 | Original Article

Air combat maneuver decision based on deep reinforcement learning with auxiliary reward

Authors: Tingyu Zhang, Yongshuai Wang, Mingwei Sun, Zengqiang Chen

Published in: Neural Computing and Applications

Abstract

In air combat maneuver decision-making, sparse rewards limit the exploration efficiency of deep reinforcement learning agents. To address this challenge, we propose an auxiliary reward function that accounts for angle, range, and altitude. We also investigate how the number of network nodes, the number of layers, and the learning rate affect the decision system, and we provide reasonable parameter ranges that can serve as a guideline. Finally, four typical air combat scenarios demonstrate the adaptability and effectiveness of the proposed scheme: the auxiliary reward significantly improves the learning ability of the deep Q network (DQN) by guiding the agents to explore more purposefully. Compared with the original deep deterministic policy gradient and soft actor-critic algorithms, the proposed method exhibits superior exploration capability and achieves higher reward, indicating that the trained agent can adapt to different air combat situations with good performance.
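
The abstract does not give the form of the auxiliary reward, so the following is only a minimal sketch of how a dense shaping term built from angle, range, and altitude might be combined with a sparse outcome reward. Every name, weight, and scale here (`auxiliary_reward`, `total_reward`, `desired_range_m`, the Gaussian shapes, the default weights) is a hypothetical choice made for illustration and is not taken from the paper.

```python
import math


def auxiliary_reward(angle_rad, range_m, altitude_diff_m,
                     desired_range_m=1000.0, range_scale_m=500.0,
                     altitude_scale_m=500.0, weights=(0.5, 0.3, 0.2)):
    """Hypothetical dense shaping term in [0, 1]; NOT the paper's formula.

    angle_rad:       angle between own heading and line of sight, in [-pi, pi]
    range_m:         distance to the opponent
    altitude_diff_m: own altitude minus opponent altitude
    """
    # Angle term: 1 when pointing straight at the opponent, 0 when pointing away.
    angle_term = 1.0 - abs(angle_rad) / math.pi
    # Range term: peaks at an assumed desired engagement range.
    range_term = math.exp(-((range_m - desired_range_m) / range_scale_m) ** 2)
    # Altitude term: peaks at co-altitude (an assumed preference).
    altitude_term = math.exp(-(altitude_diff_m / altitude_scale_m) ** 2)

    w_angle, w_range, w_alt = weights
    return w_angle * angle_term + w_range * range_term + w_alt * altitude_term


def total_reward(sparse_reward, aux, aux_weight=0.1):
    """Add the scaled auxiliary term to the sparse win/loss reward."""
    return sparse_reward + aux_weight * aux


if __name__ == "__main__":
    aux = auxiliary_reward(angle_rad=0.2, range_m=1200.0, altitude_diff_m=-100.0)
    # Mid-episode step: no win/loss signal yet, so only the shaping term contributes.
    print(total_reward(sparse_reward=0.0, aux=aux))
```

Keeping the auxiliary weight small relative to the sparse reward is one common way to let the shaping term guide exploration without dominating the win/loss signal; the appropriate scale depends on the environment.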

Title: Air combat maneuver decision based on deep reinforcement learning with auxiliary reward
Authors: Tingyu Zhang, Yongshuai Wang, Mingwei Sun, Zengqiang Chen
Publication date: 26-04-2024
Publisher: Springer London
Published in: Neural Computing and Applications
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI: https://doi.org/10.1007/s00521-024-09720-z
