Published in: Artificial Life and Robotics 2/2022

04-05-2022 | Original Article

Crafting a robotic swarm pursuit–evasion capture strategy using deep reinforcement learning

Authors: Charles H. Wu, Donald A. Sofge, Daniel M. Lofaro


Abstract

In this paper, we study the multi-agent pursuit–evasion problem and present an extension of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) deep reinforcement learning algorithm. Previous pursuit–evasion work with MADDPG has focused on training capture strategies that depend on restricting evader movement with environmental features. We demonstrate a method to train pursuer agents to collaboratively surround and encircle an evader for reliable capture without a strategy rooted in environmental entrapment (i.e., cornering). Our method uses a novel two-stage, variable-aggression, continuous reward function based on geometric inscribed circles (incircles), along with a corresponding observation space, with agents operating in an entrapment-disadvantaged environment. Our results show reliable capture of an intelligent, superior evader by three trained pursuers in open space with our encircling strategy. A key novelty of our work is demonstrating that behaviors learned with deep reinforcement learning can be transferred from a simulated robotic system with imperfect world assumptions to real-world robotic agents.
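To make the incircle idea concrete, the following is a minimal Python sketch of the geometry only: the circle inscribed in the triangle formed by three pursuers, plus a toy two-stage reward built on it. The abstract does not give the paper's reward function, so `toy_reward`, its `aggression` parameter, and both reward formulas are hypothetical illustrations, not the authors' method.

```python
import math


def incircle(a, b, c):
    """Return ((cx, cy), radius) of the circle inscribed in triangle abc."""
    # Side lengths, each named after the vertex it lies opposite to.
    la = math.dist(b, c)
    lb = math.dist(a, c)
    lc = math.dist(a, b)
    perimeter = la + lb + lc
    # Triangle area from the 2D cross product (shoelace formula).
    area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0
    if area == 0.0:
        raise ValueError("pursuers are collinear; no incircle exists")
    # The incenter is the side-length-weighted average of the vertices.
    cx = (la * a[0] + lb * b[0] + lc * c[0]) / perimeter
    cy = (la * a[1] + lb * b[1] + lc * c[1]) / perimeter
    # Inradius = area / semi-perimeter.
    radius = 2.0 * area / perimeter
    return (cx, cy), radius


def toy_reward(pursuers, evader, aggression=1.0):
    """Hypothetical two-stage reward; NOT the reward function from the paper."""
    center, radius = incircle(*pursuers)
    offset = math.dist(center, evader)
    if offset > radius:
        # Stage 1 (assumed): evader is outside the incircle, so penalize
        # the distance between the incircle center and the evader.
        return -offset
    # Stage 2 (assumed): evader is inside the incircle, so reward a
    # tighter circle; `aggression` scales how strongly closing in pays off.
    return aggression / (1.0 + radius)


if __name__ == "__main__":
    pursuers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
    print(incircle(*pursuers))
    print(toy_reward(pursuers, (1.0, 1.0)))
    print(toy_reward(pursuers, (5.0, 5.0)))
```

The sketch switches stages on whether the evader lies inside the incircle; this is one plausible reading of "two-stage", chosen here only for illustration.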


Metadata
Title
Crafting a robotic swarm pursuit–evasion capture strategy using deep reinforcement learning
Authors
Charles H. Wu
Donald A. Sofge
Daniel M. Lofaro
Publication date
04-05-2022
Publisher
Springer Japan
Published in
Artificial Life and Robotics / Issue 2/2022
Print ISSN: 1433-5298
Electronic ISSN: 1614-7456
DOI
https://doi.org/10.1007/s10015-022-00761-y
