
2021 | OriginalPaper | Chapter

10. Apprenticeship Bootstrapping Reinforcement Learning for Sky Shepherding of a Ground Swarm in Gazebo

Authors : Hung Nguyen, Matthew Garratt, Hussein A. Abbass

Published in: Shepherding UxVs for Human-Swarm Teaming

Publisher: Springer International Publishing


Abstract

The coordination of unmanned air and ground vehicles has been an active research area because the two platforms complement each other: unmanned aerial vehicles (UAVs) have a wide field of view, enabling them to effectively guide a swarm of unmanned ground vehicles (UGVs). Owing to significant recent advances in artificial intelligence (AI), autonomous agents are being used to design more robust air–ground coordination, reducing the intervention load on human operators and increasing the autonomy of unmanned air–ground systems. A shepherding-based guidance and control design allows a single learning agent to influence and manage a larger swarm of rule-based entities. In this chapter, we present a learning algorithm for a sky shepherd guiding rule-based AI-driven UGVs. The apprenticeship bootstrapping learning algorithm is introduced and applied to the aerial shepherding task.
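The rule-based swarm described above can be illustrated with a minimal sketch of shepherding dynamics: each ground agent ("sheep") is repelled by the shepherd when it is nearby and weakly attracted to the flock's centre of mass. This is only an illustrative toy model, not the chapter's actual controller; the function name `shepherd_step` and the gain parameters `w_repel` and `w_cohere` are hypothetical choices for the example.

```python
import numpy as np

def shepherd_step(sheep, shepherd, r_repel=3.0, w_repel=1.0,
                  w_cohere=0.05, dt=1.0):
    """One update of a rule-based sheep swarm reacting to a shepherd.

    sheep: (N, 2) array of positions; shepherd: (2,) position.
    Gains are illustrative, not taken from the chapter.
    """
    com = sheep.mean(axis=0)                     # flock centre of mass
    to_com = com - sheep                         # weak cohesion toward the flock
    away = sheep - shepherd                      # direction away from the shepherd
    dist = np.linalg.norm(away, axis=1, keepdims=True)
    near = dist < r_repel                        # repulsion acts only within range
    repel = np.where(near, away / np.maximum(dist, 1e-9), 0.0)
    return sheep + dt * (w_repel * repel + w_cohere * to_com)

# A shepherd placed to the left of the flock pushes nearby sheep rightward,
# while distant sheep drift gently toward the flock centre.
sheep = np.array([[0.0, 0.0], [10.0, 0.0]])
new = shepherd_step(sheep, np.array([-1.0, 0.0]))
```

In the full system, the shepherd's own motion is what the learning agent must produce; the rule-based flock response above is the fixed environment it acts on.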


Appendix
Literature
1.
go back to reference Abbeel, P., Ng, A.Y.: Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 1. ACM, New York (2004) Abbeel, P., Ng, A.Y.: Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 1. ACM, New York (2004)
2.
go back to reference Abbeel, P., Coates, A., Ng, A.Y.: Autonomous helicopter aerobatics through apprenticeship learning. Int. J. Rob. Res. 29(13), 1608–1639 (2010)CrossRef Abbeel, P., Coates, A., Ng, A.Y.: Autonomous helicopter aerobatics through apprenticeship learning. Int. J. Rob. Res. 29(13), 1608–1639 (2010)CrossRef
3.
go back to reference Aghaeeyan, A., Abdollahi, F., Talebi, H.A.: UAV–UGVs cooperation: with a moving center based trajectory. Rob. Auton. Syst. 63, 1–9 (2015)CrossRef Aghaeeyan, A., Abdollahi, F., Talebi, H.A.: UAV–UGVs cooperation: with a moving center based trajectory. Rob. Auton. Syst. 63, 1–9 (2015)CrossRef
4.
go back to reference Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning from demonstration. Rob. Auton. Syst. 57(5), 469–483 (2009)CrossRef Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning from demonstration. Rob. Auton. Syst. 57(5), 469–483 (2009)CrossRef
5.
go back to reference Arora, S., Doshi, P.: A survey of inverse reinforcement learning: Challenges, methods and progress (2018). Preprint arXiv:1806.06877 Arora, S., Doshi, P.: A survey of inverse reinforcement learning: Challenges, methods and progress (2018). Preprint arXiv:1806.06877
6.
go back to reference Balch, T., Arkin, R.C.: Behavior-based formation control for multirobot teams. IEEE Trans. Robot. Autom. 14(6), 926–939 (1998)CrossRef Balch, T., Arkin, R.C.: Behavior-based formation control for multirobot teams. IEEE Trans. Robot. Autom. 14(6), 926–939 (1998)CrossRef
7.
go back to reference Baumann, M., Büning, H.K.: Learning shepherding behavior. Ph.D. Thesis, University of Paderborn (2016) Baumann, M., Büning, H.K.: Learning shepherding behavior. Ph.D. Thesis, University of Paderborn (2016)
8.
go back to reference Baxter, J.L., Burke, E., Garibaldi, J.M., Norman, M.: Multi-robot search and rescue: A potential field based approach. In: Autonomous Robots and Agents, pp. 9–16. Springer, Berlin (2007) Baxter, J.L., Burke, E., Garibaldi, J.M., Norman, M.: Multi-robot search and rescue: A potential field based approach. In: Autonomous Robots and Agents, pp. 9–16. Springer, Berlin (2007)
9.
go back to reference Beard, R.W., Lawton, J., Hadaegh, F.Y.: A coordination architecture for spacecraft formation control. IEEE Trans. Control Syst. Technol. 9(6), 777–790 (2001)CrossRef Beard, R.W., Lawton, J., Hadaegh, F.Y.: A coordination architecture for spacecraft formation control. IEEE Trans. Control Syst. Technol. 9(6), 777–790 (2001)CrossRef
10.
go back to reference Bentivegna, D.C., Atkeson, C.G., Cheng, G.: Learning tasks from observation and practice. Rob. Auton. Syst. 47(2–3), 163–169 (2004)CrossRef Bentivegna, D.C., Atkeson, C.G., Cheng, G.: Learning tasks from observation and practice. Rob. Auton. Syst. 47(2–3), 163–169 (2004)CrossRef
11.
go back to reference Billard, A.G., Calinon, S., Dillmann, R.: Learning from humans. In: Springer Handbook of Robotics, pp. 1995–2014. Springer, Berlin (2016) Billard, A.G., Calinon, S., Dillmann, R.: Learning from humans. In: Springer Handbook of Robotics, pp. 1995–2014. Springer, Berlin (2016)
12.
go back to reference Billing, E.A., Hellström, T.: A formalism for learning from demonstration. Paladyn 1(1), 1–13 (2010) Billing, E.A., Hellström, T.: A formalism for learning from demonstration. Paladyn 1(1), 1–13 (2010)
13.
go back to reference Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., et al.: End to end learning for self-driving cars (2016). Preprint arXiv:1604.07316 Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., et al.: End to end learning for self-driving cars (2016). Preprint arXiv:1604.07316
14.
go back to reference Bojarski, M., Yeres, P., Choromanska, A., Choromanski, K., Firner, B., Jackel, L., Muller, U.: Explaining how a deep neural network trained with end-to-end learning steers a car (2017). Preprint arXiv:1704.07911 Bojarski, M., Yeres, P., Choromanska, A., Choromanski, K., Firner, B., Jackel, L., Muller, U.: Explaining how a deep neural network trained with end-to-end learning steers a car (2017). Preprint arXiv:1704.07911
15.
go back to reference Carelli, R., De la Cruz, C., Roberti, F.: Centralized formation control of non-holonomic mobile robots. Lat. Am. Appl. Res. 36(2), 63–69 (2006) Carelli, R., De la Cruz, C., Roberti, F.: Centralized formation control of non-holonomic mobile robots. Lat. Am. Appl. Res. 36(2), 63–69 (2006)
16.
go back to reference Carrio, A., Sampedro, C., Rodriguez-Ramos, A., Campoy, P.: A review of deep learning methods and applications for unmanned aerial vehicles. J. Sensors 2017, 3296874 (2017) Carrio, A., Sampedro, C., Rodriguez-Ramos, A., Campoy, P.: A review of deep learning methods and applications for unmanned aerial vehicles. J. Sensors 2017, 3296874 (2017)
17.
go back to reference Chaimowicz, L., Kumar, V.: Aerial shepherds: Coordination among UAVS and swarms of robots. In: Alami, R., Chatila, R., Asama, H. (eds.) Distributed Autonomous Robotic Systems, vol. 6, pp. 243–252. Springer Japan, Tokyo (2007)CrossRef Chaimowicz, L., Kumar, V.: Aerial shepherds: Coordination among UAVS and swarms of robots. In: Alami, R., Chatila, R., Asama, H. (eds.) Distributed Autonomous Robotic Systems, vol. 6, pp. 243–252. Springer Japan, Tokyo (2007)CrossRef
18.
go back to reference Chen, J., Zhang, X., Xin, B., Fang, H.: Coordination between unmanned aerial and ground vehicles: A taxonomy and optimization perspective. IEEE Trans. Cybern. 46(4), 959–972 (2016)CrossRef Chen, J., Zhang, X., Xin, B., Fang, H.: Coordination between unmanned aerial and ground vehicles: A taxonomy and optimization perspective. IEEE Trans. Cybern. 46(4), 959–972 (2016)CrossRef
21.
go back to reference Daniel, C., Neumann, G., Peters, J.: Learning concurrent motor skills in versatile solution spaces. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3591–3597. IEEE, Piscataway (2012) Daniel, C., Neumann, G., Peters, J.: Learning concurrent motor skills in versatile solution spaces. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3591–3597. IEEE, Piscataway (2012)
22.
go back to reference Daw, n.d., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704–1711 (2005) Daw, n.d., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704–1711 (2005)
23.
go back to reference Dillmann, R.: Teaching and learning of robot tasks via observation of human performance. Rob. Auton. Syst. 47(2–3), 109–116 (2004)CrossRef Dillmann, R.: Teaching and learning of robot tasks via observation of human performance. Rob. Auton. Syst. 47(2–3), 109–116 (2004)CrossRef
24.
go back to reference Duan, H., Li, P.: Bio-Inspired Computation in Unmanned Aerial Vehicles. Springer, Berlin (2014)CrossRef Duan, H., Li, P.: Bio-Inspired Computation in Unmanned Aerial Vehicles. Springer, Berlin (2014)CrossRef
25.
go back to reference Dunk, I., Abbass, H.: Emergence of order in leader-follower boids-inspired systems. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2016) Dunk, I., Abbass, H.: Emergence of order in leader-follower boids-inspired systems. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2016)
26.
go back to reference Farinelli, A., Iocchi, L., Nardi, D.: Multirobot systems: a classification focused on coordination. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 34(5), 2015–2028 (2004) Farinelli, A., Iocchi, L., Nardi, D.: Multirobot systems: a classification focused on coordination. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 34(5), 2015–2028 (2004)
27.
go back to reference Fernandez-Rojas, R., Perry, A., Singh, H., Campbell, B., Elsayed, S., Hunjet, R., Abbass, H.A.: Contextual awareness in human-advanced-vehicle systems: A survey. IEEE Access 7, 33304–33328 (2019)CrossRef Fernandez-Rojas, R., Perry, A., Singh, H., Campbell, B., Elsayed, S., Hunjet, R., Abbass, H.A.: Contextual awareness in human-advanced-vehicle systems: A survey. IEEE Access 7, 33304–33328 (2019)CrossRef
28.
go back to reference Fraser, B., Hunjet, R.: Data ferrying in tactical networks using swarm intelligence and stigmergic coordination. In: 2016 26th International Telecommunication Networks and Applications Conference (ITNAC), pp. 1–6. IEEE, Piscataway (2016) Fraser, B., Hunjet, R.: Data ferrying in tactical networks using swarm intelligence and stigmergic coordination. In: 2016 26th International Telecommunication Networks and Applications Conference (ITNAC), pp. 1–6. IEEE, Piscataway (2016)
29.
go back to reference Gee, A., Abbass, H.: Transparent machine education of neural networks for swarm shepherding using curriculum design. In: Proceedings of the International Joint Conference on Neural Networks (2019) Gee, A., Abbass, H.: Transparent machine education of neural networks for swarm shepherding using curriculum design. In: Proceedings of the International Joint Conference on Neural Networks (2019)
30.
go back to reference Glavic, M., Fonteneau, R., Ernst, D.: Reinforcement learning for electric power system decision and control: Past considerations and perspectives. IFAC-PapersOnLine 50(1), 6918–6927 (2017)CrossRef Glavic, M., Fonteneau, R., Ernst, D.: Reinforcement learning for electric power system decision and control: Past considerations and perspectives. IFAC-PapersOnLine 50(1), 6918–6927 (2017)CrossRef
31.
go back to reference Graves, A., Mohamed, A.r., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645–6649. IEEE, Piscataway (2013) Graves, A., Mohamed, A.r., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645–6649. IEEE, Piscataway (2013)
32.
go back to reference Grollman, D.H., Jenkins, O.C.: Incremental learning of subtasks from unsegmented demonstration. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 261–266. IEEE, Piscataway (2010) Grollman, D.H., Jenkins, O.C.: Incremental learning of subtasks from unsegmented demonstration. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 261–266. IEEE, Piscataway (2010)
33.
go back to reference Grounds, M., Kudenko, D.: Parallel reinforcement learning with linear function approximation. In: Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, pp. 60–74. Springer, Berlin (2008) Grounds, M., Kudenko, D.: Parallel reinforcement learning with linear function approximation. In: Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, pp. 60–74. Springer, Berlin (2008)
34.
go back to reference Guenter, F., Hersch, M., Calinon, S., Billard, A.: Reinforcement learning for imitating constrained reaching movements. Adv. Rob. 21(13), 1521–1544 (2007)CrossRef Guenter, F., Hersch, M., Calinon, S., Billard, A.: Reinforcement learning for imitating constrained reaching movements. Adv. Rob. 21(13), 1521–1544 (2007)CrossRef
35.
go back to reference Guillet, A., Lenain, R., Thuilot, B., Rousseau, V.: Formation control of agricultural mobile robots: A bidirectional weighted constraints approach. J. Field Rob. 34, 1260–1274 (2017)CrossRef Guillet, A., Lenain, R., Thuilot, B., Rousseau, V.: Formation control of agricultural mobile robots: A bidirectional weighted constraints approach. J. Field Rob. 34, 1260–1274 (2017)CrossRef
36.
go back to reference Guo, X., Denman, S., Fookes, C., Mejias, L., Sridharan, S.: Automatic UAV forced landing site detection using machine learning. In: 2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1–7. IEEE, Piscataway (2014) Guo, X., Denman, S., Fookes, C., Mejias, L., Sridharan, S.: Automatic UAV forced landing site detection using machine learning. In: 2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1–7. IEEE, Piscataway (2014)
37.
go back to reference Gupta, J.K., Egorov, M., Kochenderfer, M.: Cooperative multi-agent control using deep reinforcement learning. In: International Conference on Autonomous Agents and Multiagent Systems, pp. 66–83. Springer, Berlin (2017) Gupta, J.K., Egorov, M., Kochenderfer, M.: Cooperative multi-agent control using deep reinforcement learning. In: International Conference on Autonomous Agents and Multiagent Systems, pp. 66–83. Springer, Berlin (2017)
38.
go back to reference Howitt, S., Richards, D.: The human machine interface for airborne control of UAVS. In: 2nd AIAA “Unmanned Unlimited” Conference and Workshop & Exhibit, p. 6593 (2003) Howitt, S., Richards, D.: The human machine interface for airborne control of UAVS. In: 2nd AIAA “Unmanned Unlimited” Conference and Workshop & Exhibit, p. 6593 (2003)
40.
go back to reference Hudjakov, R., Tamre, M.: Aerial imagery terrain classification for long-range autonomous navigation. In: 2009 International Symposium on Optomechatronic Technologies, pp. 88–91. IEEE, Piscataway (2009) Hudjakov, R., Tamre, M.: Aerial imagery terrain classification for long-range autonomous navigation. In: 2009 International Symposium on Optomechatronic Technologies, pp. 88–91. IEEE, Piscataway (2009)
41.
go back to reference Hunjet, R., Stevens, T., Elliot, M., Fraser, B., George, P.: Survivable communications and autonomous delivery service a generic swarming framework enabling communications in contested environments. In: MILCOM 2017–2017 IEEE Military Communications Conference (MILCOM), pp. 788–793. IEEE, Piscataway (2017) Hunjet, R., Stevens, T., Elliot, M., Fraser, B., George, P.: Survivable communications and autonomous delivery service a generic swarming framework enabling communications in contested environments. In: MILCOM 2017–2017 IEEE Military Communications Conference (MILCOM), pp. 788–793. IEEE, Piscataway (2017)
42.
go back to reference Hussein, A., Gaber, M.M., Elyan, E., Jayne, C.: Imitation learning: A survey of learning methods. ACM Comput. Surv. (CSUR) 50(2), 21 (2017) Hussein, A., Gaber, M.M., Elyan, E., Jayne, C.: Imitation learning: A survey of learning methods. ACM Comput. Surv. (CSUR) 50(2), 21 (2017)
43.
go back to reference Hwang, Y.K., Choi, K.J., Hong, D.S.: Self-learning control of cooperative motion for a humanoid robot. In: Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006, pp. 475–480. IEEE, Piscataway (2006) Hwang, Y.K., Choi, K.J., Hong, D.S.: Self-learning control of cooperative motion for a humanoid robot. In: Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006, pp. 475–480. IEEE, Piscataway (2006)
44.
go back to reference Iima, H., Kuroe, Y.: Swarm reinforcement learning method for a multi-robot formation problem. In: 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2298–2303. IEEE, Piscataway (2013) Iima, H., Kuroe, Y.: Swarm reinforcement learning method for a multi-robot formation problem. In: 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2298–2303. IEEE, Piscataway (2013)
45.
go back to reference Jansen, B., Belpaeme, T.: A computational model of intention reading in imitation. Rob. Auton. Syst. 54(5), 394–402 (2006)CrossRef Jansen, B., Belpaeme, T.: A computational model of intention reading in imitation. Rob. Auton. Syst. 54(5), 394–402 (2006)CrossRef
46.
go back to reference Justesen, N., Risi, S.: Learning macromanagement in starcraft from replays using deep learning. In: 2017 IEEE Conference on Computational Intelligence and Games (CIG), pp. 162–169. IEEE, Piscataway (2017) Justesen, N., Risi, S.: Learning macromanagement in starcraft from replays using deep learning. In: 2017 IEEE Conference on Computational Intelligence and Games (CIG), pp. 162–169. IEEE, Piscataway (2017)
47.
go back to reference Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Lien, J.M.: A dddams-based UAV and UGV team formation approach for surveillance and crowd control. In: Proceedings of the 2014 Winter Simulation Conference, pp. 2907–2918. IEEE Press, Piscataway (2014) Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Lien, J.M.: A dddams-based UAV and UGV team formation approach for surveillance and crowd control. In: Proceedings of the 2014 Winter Simulation Conference, pp. 2907–2918. IEEE Press, Piscataway (2014)
48.
go back to reference Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Mousavian, A., Lien, J.M.: A comparative study of control architectures in UAV/UGV-based surveillance system. In: IIE Annual Conference. Proceedings. Institute of Industrial and Systems Engineers (IISE), p. 3455 (2014) Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Mousavian, A., Lien, J.M.: A comparative study of control architectures in UAV/UGV-based surveillance system. In: IIE Annual Conference. Proceedings. Institute of Industrial and Systems Engineers (IISE), p. 3455 (2014)
49.
go back to reference Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014). Preprint arXiv:1412.6980 Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014). Preprint arXiv:1412.6980
50.
go back to reference Kober, J., Peters, J.R.: Policy search for motor primitives in robotics. In: Advances in Neural Information Processing Systems, pp. 849–856 (2009) Kober, J., Peters, J.R.: Policy search for motor primitives in robotics. In: Advances in Neural Information Processing Systems, pp. 849–856 (2009)
51.
go back to reference Koenig, N., Howard, A.: Gazebo-3d multiple robot simulator with dynamics (2006) Koenig, N., Howard, A.: Gazebo-3d multiple robot simulator with dynamics (2006)
52.
go back to reference Kolling, A., Walker, P., Chakraborty, N., Sycara, K., Lewis, M.: Human interaction with robot swarms: A survey. IEEE Trans. Human-Mach. Syst. 46(1), 9–26 (2015)CrossRef Kolling, A., Walker, P., Chakraborty, N., Sycara, K., Lewis, M.: Human interaction with robot swarms: A survey. IEEE Trans. Human-Mach. Syst. 46(1), 9–26 (2015)CrossRef
53.
go back to reference Konidaris, G., Osentoski, S., Thomas, P.S.: Value function approximation in reinforcement learning using the fourier basis. In: Association for the Advancement of Artificial Intelligence, vol. 6, p. 7 (2011) Konidaris, G., Osentoski, S., Thomas, P.S.: Value function approximation in reinforcement learning using the fourier basis. In: Association for the Advancement of Artificial Intelligence, vol. 6, p. 7 (2011)
54.
go back to reference Kormushev, P., Calinon, S., Caldwell, D.G.: Robot motor skill coordination with em-based reinforcement learning. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3232–3237. IEEE, Piscataway (2010) Kormushev, P., Calinon, S., Caldwell, D.G.: Robot motor skill coordination with em-based reinforcement learning. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3232–3237. IEEE, Piscataway (2010)
55.
go back to reference Kormushev, P., Calinon, S., Saegusa, R., Metta, G.: Learning the skill of archery by a humanoid robot ICub. In: 2010 10th IEEE-RAS International Conference on Humanoid Robots, pp. 417–423. IEEE, Piscataway (2010) Kormushev, P., Calinon, S., Saegusa, R., Metta, G.: Learning the skill of archery by a humanoid robot ICub. In: 2010 10th IEEE-RAS International Conference on Humanoid Robots, pp. 417–423. IEEE, Piscataway (2010)
56.
go back to reference Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
57.
go back to reference Kulić, D., Ott, C., Lee, D., Ishikawa, J., Nakamura, Y.: Incremental learning of full body motion primitives and their sequencing through human motion observation. Int. J. Rob. Res. 31(3), 330–345 (2012)CrossRef Kulić, D., Ott, C., Lee, D., Ishikawa, J., Nakamura, Y.: Incremental learning of full body motion primitives and their sequencing through human motion observation. Int. J. Rob. Res. 31(3), 330–345 (2012)CrossRef
58.
go back to reference LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)CrossRef LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)CrossRef
59.
go back to reference Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies. J. Mach. Learn. Res. 17(1), 1334–1373 (2016)MathSciNetMATH Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies. J. Mach. Learn. Res. 17(1), 1334–1373 (2016)MathSciNetMATH
60.
go back to reference Li, J., Monroe, W., Ritter, A., Galley, M., Gao, J., Jurafsky, D.: Deep reinforcement learning for dialogue generation (2016). Preprint arXiv:1606.01541 Li, J., Monroe, W., Ritter, A., Galley, M., Gao, J., Jurafsky, D.: Deep reinforcement learning for dialogue generation (2016). Preprint arXiv:1606.01541
61.
go back to reference Li, X., Chen, Y.N., Li, L., Gao, J., Celikyilmaz, A.: End-to-end task-completion neural dialogue systems (2017). Preprint arXiv:1703.01008 Li, X., Chen, Y.N., Li, L., Gao, J., Celikyilmaz, A.: End-to-end task-completion neural dialogue systems (2017). Preprint arXiv:1703.01008
62.
go back to reference Lin, S., Garratt, M.A., Lambert, A.J.: Monocular vision-based real-time target recognition and tracking for autonomously landing an uav in a cluttered shipboard environment. Autonom. Rob. 41(4), 881–901 (2017)CrossRef Lin, S., Garratt, M.A., Lambert, A.J.: Monocular vision-based real-time target recognition and tracking for autonomously landing an uav in a cluttered shipboard environment. Autonom. Rob. 41(4), 881–901 (2017)CrossRef
63.
go back to reference Liu, M., Amato, C., Anesta, E.P., Griffith, J.D., How, J.P.: Learning for decentralized control of multiagent systems in large, partially-observable stochastic environments. In: Thirtieth AAAI Conference on Artificial Intelligence (2016) Liu, M., Amato, C., Anesta, E.P., Griffith, J.D., How, J.P.: Learning for decentralized control of multiagent systems in large, partially-observable stochastic environments. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
64.
go back to reference Long, N.K., Sammut, K., Sgarioto, D., Garratt, M., Abbass, H.A.: A comprehensive review of shepherding as a bio-inspired swarm-robotics guidance approach. IEEE Trans. Emer. Topics Comput. Intell. 4, 523–537 (2020)CrossRef Long, N.K., Sammut, K., Sgarioto, D., Garratt, M., Abbass, H.A.: A comprehensive review of shepherding as a bio-inspired swarm-robotics guidance approach. IEEE Trans. Emer. Topics Comput. Intell. 4, 523–537 (2020)CrossRef
65.
go back to reference Mangin, O., Oudeyer, P.Y.: Unsupervised learning of simultaneous motor primitives through imitation. In: Frontiers in Computational Neuroscience Conference Abstract: IEEE ICDL-EPIROB 2011 (2011) Mangin, O., Oudeyer, P.Y.: Unsupervised learning of simultaneous motor primitives through imitation. In: Frontiers in Computational Neuroscience Conference Abstract: IEEE ICDL-EPIROB 2011 (2011)
66.
go back to reference Martinez, S., Cortes, J., Bullo, F.: Motion coordination with distributed information. IEEE Control Syst. Mag. 27(4), 75–88 (2007)CrossRef Martinez, S., Cortes, J., Bullo, F.: Motion coordination with distributed information. IEEE Control Syst. Mag. 27(4), 75–88 (2007)CrossRef
67.
go back to reference Mirowski, P., Pascanu, R., Viola, F., Soyer, H., Ballard, A.J., Banino, A., Denil, M., Goroshin, R., Sifre, L., Kavukcuoglu, K., et al.: Learning to navigate in complex environments (2016). Preprint arXiv:1611.03673 Mirowski, P., Pascanu, R., Viola, F., Soyer, H., Ballard, A.J., Banino, A., Denil, M., Goroshin, R., Sifre, L., Kavukcuoglu, K., et al.: Learning to navigate in complex environments (2016). Preprint arXiv:1611.03673
68.
go back to reference Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)CrossRef Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)CrossRef
69.
go back to reference Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., Kavukcuoglu, K.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937 (2016) Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., Kavukcuoglu, K.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937 (2016)
70.
go back to reference Mülling, K., Kober, J., Kroemer, O., Peters, J.: Learning to select and generalize striking movements in robot table tennis. Int. J. Rob. Res. 32(3), 263–279 (2013)CrossRef Mülling, K., Kober, J., Kroemer, O., Peters, J.: Learning to select and generalize striking movements in robot table tennis. Int. J. Rob. Res. 32(3), 263–279 (2013)CrossRef
71.
go back to reference Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Supervised deep actor network for imitation learning in a ground-air UAV-UGVs coordination task. In: 2017 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2017) Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Supervised deep actor network for imitation learning in a ground-air UAV-UGVs coordination task. In: 2017 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2017)
72.
go back to reference Nguyen, H., Garratt, M., Abbass, H.: Apprenticeship bootstrapping. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018) Nguyen, H., Garratt, M., Abbass, H.: Apprenticeship bootstrapping. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018)
73.
go back to reference Nguyen, H., Tran, V., Nguyen, T., Garratt, M., Kasmarik, K., Barlow, M., Anavatti, S., Abbass, H.: Apprenticeship bootstrapping via deep learning with a safety net for UAV-UGV interaction (2018). Preprint arXiv:1810.04344 Nguyen, H., Tran, V., Nguyen, T., Garratt, M., Kasmarik, K., Barlow, M., Anavatti, S., Abbass, H.: Apprenticeship bootstrapping via deep learning with a safety net for UAV-UGV interaction (2018). Preprint arXiv:1810.04344
74.
go back to reference Nguyen, T., Nguyen, H., Debie, E., Kasmarik, K., Garratt, M., Abbass, H.: Swarm Q-Leaming with knowledge sharing within environments for formation control. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018) Nguyen, T., Nguyen, H., Debie, E., Kasmarik, K., Garratt, M., Abbass, H.: Swarm Q-Leaming with knowledge sharing within environments for formation control. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018)
75.
go back to reference Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Apprenticeship learning for continuous state spaces and actions in a swarm-guidance shepherding task. In: 2019 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 102–109. IEEE, Piscataway (2019) Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Apprenticeship learning for continuous state spaces and actions in a swarm-guidance shepherding task. In: 2019 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 102–109. IEEE, Piscataway (2019)
76.
go back to reference Nguyen, H.T., Nguyen, T.D., Garratt, M., Kasmarik, K., Anavatti, S., Barlow, M., Abbass, H.A.: A deep hierarchical reinforcement learner for aerial shepherding of ground swarms. In: International Conference on Neural Information Processing, pp. 658–669. Springer, Berlin (2019) Nguyen, H.T., Nguyen, T.D., Garratt, M., Kasmarik, K., Anavatti, S., Barlow, M., Abbass, H.A.: A deep hierarchical reinforcement learner for aerial shepherding of ground swarms. In: International Conference on Neural Information Processing, pp. 658–669. Springer, Berlin (2019)
77. Niekum, S., Osentoski, S., Konidaris, G., Barto, A.G.: Learning and generalization of complex tasks from unstructured demonstrations. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5239–5246. IEEE, Piscataway (2012)
78.
79. Oh, H., Shirazi, A.R., Sun, C., Jin, Y.: Bio-inspired self-organising multi-robot pattern formation: a review. Rob. Auton. Syst. 91, 83–100 (2017)
80. Palmer, G., Tuyls, K., Bloembergen, D., Savani, R.: Lenient multi-agent deep reinforcement learning. In: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pp. 443–451. International Foundation for Autonomous Agents and Multiagent Systems (2018)
82. Pastor, P., Kalakrishnan, M., Chitta, S., Theodorou, E., Schaal, S.: Skill learning and task outcome prediction for manipulation. In: 2011 IEEE International Conference on Robotics and Automation, pp. 3828–3834. IEEE, Piscataway (2011)
83. Pendleton, S.D., Andersen, H., Du, X., Shen, X., Meghjani, M., Eng, Y.H., Rus, D., Ang, M.H.: Perception, planning, control, and coordination for autonomous vehicles. Machines 5(1), 6 (2017)
84. Ramachandran, D., Amir, E.: Bayesian inverse reinforcement learning. In: International Joint Conferences on Artificial Intelligence, vol. 7, pp. 2586–2591 (2007)
85. Ross, S., Melik-Barkhudarov, N., Shankar, K.S., Wendel, A., Dey, D., Bagnell, J.A., Hebert, M.: Learning monocular reactive UAV control in cluttered natural environments. In: 2013 IEEE International Conference on Robotics and Automation, pp. 1765–1772. IEEE, Piscataway (2013)
86. Sen, A., Sahoo, S.R., Kothari, M.: Cooperative formation control strategy in heterogeneous network with bounded acceleration. In: 2017 Indian Control Conference (ICC), pp. 344–349. IEEE, Piscataway (2017)
87. Skoglund, A., Iliev, B., Kadmiry, B., Palm, R.: Programming by demonstration of pick-and-place tasks for industrial manipulators using task primitives. In: 2007 International Symposium on Computational Intelligence in Robotics and Automation, pp. 368–373. IEEE, Piscataway (2007)
88. Song, J., Ren, H., Sadigh, D., Ermon, S.: Multi-agent generative adversarial imitation learning. In: Advances in Neural Information Processing Systems, pp. 7461–7472 (2018)
89. Speck, C., Bucci, D.J.: Distributed UAV swarm formation control via object-focused, multi-objective SARSA. In: 2018 Annual American Control Conference (ACC), pp. 6596–6601. IEEE, Piscataway (2018)
90. Strömbom, D., Mann, R.P., Wilson, A.M., Hailes, S., Morton, A.J., Sumpter, D.J.T., King, A.J.: Solving the shepherding problem: heuristics for herding autonomous, interacting agents. J. R. Soc. Interface 11(100) (2014). https://browzine.com/articles/52614503
91. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, vol. 1. MIT Press, Cambridge (1998)
92. Szepesvári, C.: Algorithms for reinforcement learning. Synth. Lect. Artif. Intell. Mach. Learn. 4(1), 1–103 (2010)
93. Trentini, M., Beckman, B.: Semi-autonomous UAV/UGV for dismounted urban operations. In: Unmanned Systems Technology XII, vol. 7692, p. 76921C. International Society for Optics and Photonics (2010)
94. Tsitsiklis, J.N., Van Roy, B.: Analysis of temporal-difference learning with function approximation. In: Advances in Neural Information Processing Systems, pp. 1075–1081 (1997)
95. Vidal, R., Rashid, S., Sharp, C., Shakernia, O., Kim, J., Sastry, S.: Pursuit-evasion games with unmanned ground and aerial vehicles. In: Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No. 01CH37164), vol. 3, pp. 2948–2955. IEEE, Piscataway (2001)
96. Waslander, S.L.: Unmanned aerial and ground vehicle teams: recent work and open problems. In: Autonomous Control Systems and Vehicles, pp. 21–36. Springer, Berlin (2013)
97. Wulfmeier, M., Ondruska, P., Posner, I.: Maximum entropy deep inverse reinforcement learning (2015). Preprint arXiv:1507.04888
98. Xu, D., Zhang, X., Zhu, Z., Chen, C., Yang, P.: Behavior-based formation control of swarm robots. Math. Problems Eng. 2014, 205759 (2014)
99. Yang, Z., Merrick, K., Jin, L., Abbass, H.A.: Hierarchical deep reinforcement learning for continuous action control. IEEE Trans. Neural Netw. Learn. Syst. (99), 1–11 (2018)
100. Yoshikai, T., Otake, N., Mizuuchi, I., Inaba, M., Inoue, H.: Development of an imitation behavior in humanoid Kenta with reinforcement learning algorithm based on the attention during imitation. In: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), vol. 2, pp. 1192–1197. IEEE, Piscataway (2004)
101. You, C., Lu, J., Filev, D., Tsiotras, P.: Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning. Rob. Auton. Syst. 114, 1–18 (2019)
102. Yu, H., Beard, R.W., Argyle, M., Chamberlain, C.: Probabilistic path planning for cooperative target tracking using aerial and ground vehicles. In: Proceedings of the 2011 American Control Conference, pp. 4673–4678. IEEE, Piscataway (2011)
103. Zhang, T., Kahn, G., Levine, S., Abbeel, P.: Learning deep control policies for autonomous aerial vehicles with MPC-guided policy search. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 528–535. IEEE, Piscataway (2016)
104. Zhang, T., Li, Q., Zhang, C.S., Liang, H.W., Li, P., Wang, T.M., Li, S., Zhu, Y.L., Wu, C.: Current trends in the development of intelligent unmanned autonomous systems. Front. Inf. Technol. Electron. Eng. 18(1), 68–85 (2017)
105. Zhan, E., Zheng, S., Yue, Y., Sha, L., Lucey, P.: Generative multi-agent behavioral cloning. In: Proceedings of the 35th International Conference on Machine Learning (2018)
106. Ziebart, B.D., Maas, A.L., Bagnell, J.A., Dey, A.K.: Maximum entropy inverse reinforcement learning. In: Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, vol. 8, pp. 1433–1438. Chicago (2008)
Metadata
Title
Apprenticeship Bootstrapping Reinforcement Learning for Sky Shepherding of a Ground Swarm in Gazebo
Authors
Hung Nguyen
Matthew Garratt
Hussein A. Abbass
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-60898-9_10
