
2021 | OriginalPaper | Chapter

10. Apprenticeship Bootstrapping Reinforcement Learning for Sky Shepherding of a Ground Swarm in Gazebo

Authors : Hung Nguyen, Matthew Garratt, Hussein A. Abbass

Published in: Shepherding UxVs for Human-Swarm Teaming

Publisher: Springer International Publishing


Abstract

The coordination of unmanned air and ground vehicles has been an active research area because the two platforms complement each other: unmanned aerial vehicles (UAVs) have a wide field of view, enabling them to effectively guide a swarm of unmanned ground vehicles (UGVs). Owing to significant recent advances in artificial intelligence (AI), autonomous agents are being used to design more robust air–ground coordination, reducing the intervention load on human operators and increasing the autonomy of unmanned air–ground systems. A shepherding-based guidance and control design allows a single learning agent to influence and manage a larger swarm of rule-based entities. In this chapter, we present a learning algorithm for a sky shepherd guiding rule-based AI-driven UGVs. The apprenticeship bootstrapping learning algorithm is introduced and applied to the aerial shepherding task.
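The rule-based swarm described above can be illustrated with a minimal sketch of shepherding dynamics: each ground agent ("sheep") is repelled by the shepherd when it is nearby and weakly attracted to the flock's centre of mass. This is only an illustrative toy model, not the chapter's actual controller; the function name `shepherd_step` and the gain parameters `w_repel` and `w_cohere` are hypothetical choices for the example.

```python
import numpy as np

def shepherd_step(sheep, shepherd, r_repel=3.0, w_repel=1.0,
                  w_cohere=0.05, dt=1.0):
    """One update of a rule-based sheep swarm reacting to a shepherd.

    sheep: (N, 2) array of positions; shepherd: (2,) position.
    Gains are illustrative, not taken from the chapter.
    """
    com = sheep.mean(axis=0)                     # flock centre of mass
    to_com = com - sheep                         # weak cohesion toward the flock
    away = sheep - shepherd                      # direction away from the shepherd
    dist = np.linalg.norm(away, axis=1, keepdims=True)
    near = dist < r_repel                        # repulsion acts only within range
    repel = np.where(near, away / np.maximum(dist, 1e-9), 0.0)
    return sheep + dt * (w_repel * repel + w_cohere * to_com)

# A shepherd placed to the left of the flock pushes nearby sheep rightward,
# while distant sheep drift gently toward the flock centre.
sheep = np.array([[0.0, 0.0], [10.0, 0.0]])
new = shepherd_step(sheep, np.array([-1.0, 0.0]))
```

In the full system, the shepherd's own motion is what the learning agent must produce; the rule-based flock response above is the fixed environment it acts on.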


Appendix
Literature
1.
go back to reference Abbeel, P., Ng, A.Y.: Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 1. ACM, New York (2004) Abbeel, P., Ng, A.Y.: Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the Twenty-First International Conference on Machine Learning, p. 1. ACM, New York (2004)
2.
go back to reference Abbeel, P., Coates, A., Ng, A.Y.: Autonomous helicopter aerobatics through apprenticeship learning. Int. J. Rob. Res. 29(13), 1608–1639 (2010)CrossRef Abbeel, P., Coates, A., Ng, A.Y.: Autonomous helicopter aerobatics through apprenticeship learning. Int. J. Rob. Res. 29(13), 1608–1639 (2010)CrossRef
3.
go back to reference Aghaeeyan, A., Abdollahi, F., Talebi, H.A.: UAV–UGVs cooperation: with a moving center based trajectory. Rob. Auton. Syst. 63, 1–9 (2015)CrossRef Aghaeeyan, A., Abdollahi, F., Talebi, H.A.: UAV–UGVs cooperation: with a moving center based trajectory. Rob. Auton. Syst. 63, 1–9 (2015)CrossRef
4.
go back to reference Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning from demonstration. Rob. Auton. Syst. 57(5), 469–483 (2009)CrossRef Argall, B.D., Chernova, S., Veloso, M., Browning, B.: A survey of robot learning from demonstration. Rob. Auton. Syst. 57(5), 469–483 (2009)CrossRef
5.
go back to reference Arora, S., Doshi, P.: A survey of inverse reinforcement learning: Challenges, methods and progress (2018). Preprint arXiv:1806.06877 Arora, S., Doshi, P.: A survey of inverse reinforcement learning: Challenges, methods and progress (2018). Preprint arXiv:1806.06877
6.
go back to reference Balch, T., Arkin, R.C.: Behavior-based formation control for multirobot teams. IEEE Trans. Robot. Autom. 14(6), 926–939 (1998)CrossRef Balch, T., Arkin, R.C.: Behavior-based formation control for multirobot teams. IEEE Trans. Robot. Autom. 14(6), 926–939 (1998)CrossRef
7.
go back to reference Baumann, M., Büning, H.K.: Learning shepherding behavior. Ph.D. Thesis, University of Paderborn (2016) Baumann, M., Büning, H.K.: Learning shepherding behavior. Ph.D. Thesis, University of Paderborn (2016)
8.
go back to reference Baxter, J.L., Burke, E., Garibaldi, J.M., Norman, M.: Multi-robot search and rescue: A potential field based approach. In: Autonomous Robots and Agents, pp. 9–16. Springer, Berlin (2007) Baxter, J.L., Burke, E., Garibaldi, J.M., Norman, M.: Multi-robot search and rescue: A potential field based approach. In: Autonomous Robots and Agents, pp. 9–16. Springer, Berlin (2007)
9.
go back to reference Beard, R.W., Lawton, J., Hadaegh, F.Y.: A coordination architecture for spacecraft formation control. IEEE Trans. Control Syst. Technol. 9(6), 777–790 (2001)CrossRef Beard, R.W., Lawton, J., Hadaegh, F.Y.: A coordination architecture for spacecraft formation control. IEEE Trans. Control Syst. Technol. 9(6), 777–790 (2001)CrossRef
10.
go back to reference Bentivegna, D.C., Atkeson, C.G., Cheng, G.: Learning tasks from observation and practice. Rob. Auton. Syst. 47(2–3), 163–169 (2004)CrossRef Bentivegna, D.C., Atkeson, C.G., Cheng, G.: Learning tasks from observation and practice. Rob. Auton. Syst. 47(2–3), 163–169 (2004)CrossRef
11.
go back to reference Billard, A.G., Calinon, S., Dillmann, R.: Learning from humans. In: Springer Handbook of Robotics, pp. 1995–2014. Springer, Berlin (2016) Billard, A.G., Calinon, S., Dillmann, R.: Learning from humans. In: Springer Handbook of Robotics, pp. 1995–2014. Springer, Berlin (2016)
12.
go back to reference Billing, E.A., Hellström, T.: A formalism for learning from demonstration. Paladyn 1(1), 1–13 (2010) Billing, E.A., Hellström, T.: A formalism for learning from demonstration. Paladyn 1(1), 1–13 (2010)
13.
go back to reference Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., et al.: End to end learning for self-driving cars (2016). Preprint arXiv:1604.07316 Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J., et al.: End to end learning for self-driving cars (2016). Preprint arXiv:1604.07316
14.
go back to reference Bojarski, M., Yeres, P., Choromanska, A., Choromanski, K., Firner, B., Jackel, L., Muller, U.: Explaining how a deep neural network trained with end-to-end learning steers a car (2017). Preprint arXiv:1704.07911 Bojarski, M., Yeres, P., Choromanska, A., Choromanski, K., Firner, B., Jackel, L., Muller, U.: Explaining how a deep neural network trained with end-to-end learning steers a car (2017). Preprint arXiv:1704.07911
15.
go back to reference Carelli, R., De la Cruz, C., Roberti, F.: Centralized formation control of non-holonomic mobile robots. Lat. Am. Appl. Res. 36(2), 63–69 (2006) Carelli, R., De la Cruz, C., Roberti, F.: Centralized formation control of non-holonomic mobile robots. Lat. Am. Appl. Res. 36(2), 63–69 (2006)
16.
go back to reference Carrio, A., Sampedro, C., Rodriguez-Ramos, A., Campoy, P.: A review of deep learning methods and applications for unmanned aerial vehicles. J. Sensors 2017, 3296874 (2017) Carrio, A., Sampedro, C., Rodriguez-Ramos, A., Campoy, P.: A review of deep learning methods and applications for unmanned aerial vehicles. J. Sensors 2017, 3296874 (2017)
17.
go back to reference Chaimowicz, L., Kumar, V.: Aerial shepherds: Coordination among UAVS and swarms of robots. In: Alami, R., Chatila, R., Asama, H. (eds.) Distributed Autonomous Robotic Systems, vol. 6, pp. 243–252. Springer Japan, Tokyo (2007)CrossRef Chaimowicz, L., Kumar, V.: Aerial shepherds: Coordination among UAVS and swarms of robots. In: Alami, R., Chatila, R., Asama, H. (eds.) Distributed Autonomous Robotic Systems, vol. 6, pp. 243–252. Springer Japan, Tokyo (2007)CrossRef
18.
go back to reference Chen, J., Zhang, X., Xin, B., Fang, H.: Coordination between unmanned aerial and ground vehicles: A taxonomy and optimization perspective. IEEE Trans. Cybern. 46(4), 959–972 (2016)CrossRef Chen, J., Zhang, X., Xin, B., Fang, H.: Coordination between unmanned aerial and ground vehicles: A taxonomy and optimization perspective. IEEE Trans. Cybern. 46(4), 959–972 (2016)CrossRef
21.
go back to reference Daniel, C., Neumann, G., Peters, J.: Learning concurrent motor skills in versatile solution spaces. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3591–3597. IEEE, Piscataway (2012) Daniel, C., Neumann, G., Peters, J.: Learning concurrent motor skills in versatile solution spaces. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3591–3597. IEEE, Piscataway (2012)
22.
go back to reference Daw, n.d., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704–1711 (2005) Daw, n.d., Niv, Y., Dayan, P.: Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8(12), 1704–1711 (2005)
23.
go back to reference Dillmann, R.: Teaching and learning of robot tasks via observation of human performance. Rob. Auton. Syst. 47(2–3), 109–116 (2004)CrossRef Dillmann, R.: Teaching and learning of robot tasks via observation of human performance. Rob. Auton. Syst. 47(2–3), 109–116 (2004)CrossRef
24.
go back to reference Duan, H., Li, P.: Bio-Inspired Computation in Unmanned Aerial Vehicles. Springer, Berlin (2014)CrossRef Duan, H., Li, P.: Bio-Inspired Computation in Unmanned Aerial Vehicles. Springer, Berlin (2014)CrossRef
25.
go back to reference Dunk, I., Abbass, H.: Emergence of order in leader-follower boids-inspired systems. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2016) Dunk, I., Abbass, H.: Emergence of order in leader-follower boids-inspired systems. In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2016)
26.
go back to reference Farinelli, A., Iocchi, L., Nardi, D.: Multirobot systems: a classification focused on coordination. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 34(5), 2015–2028 (2004) Farinelli, A., Iocchi, L., Nardi, D.: Multirobot systems: a classification focused on coordination. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 34(5), 2015–2028 (2004)
27.
go back to reference Fernandez-Rojas, R., Perry, A., Singh, H., Campbell, B., Elsayed, S., Hunjet, R., Abbass, H.A.: Contextual awareness in human-advanced-vehicle systems: A survey. IEEE Access 7, 33304–33328 (2019)CrossRef Fernandez-Rojas, R., Perry, A., Singh, H., Campbell, B., Elsayed, S., Hunjet, R., Abbass, H.A.: Contextual awareness in human-advanced-vehicle systems: A survey. IEEE Access 7, 33304–33328 (2019)CrossRef
28.
go back to reference Fraser, B., Hunjet, R.: Data ferrying in tactical networks using swarm intelligence and stigmergic coordination. In: 2016 26th International Telecommunication Networks and Applications Conference (ITNAC), pp. 1–6. IEEE, Piscataway (2016) Fraser, B., Hunjet, R.: Data ferrying in tactical networks using swarm intelligence and stigmergic coordination. In: 2016 26th International Telecommunication Networks and Applications Conference (ITNAC), pp. 1–6. IEEE, Piscataway (2016)
29.
go back to reference Gee, A., Abbass, H.: Transparent machine education of neural networks for swarm shepherding using curriculum design. In: Proceedings of the International Joint Conference on Neural Networks (2019) Gee, A., Abbass, H.: Transparent machine education of neural networks for swarm shepherding using curriculum design. In: Proceedings of the International Joint Conference on Neural Networks (2019)
30.
go back to reference Glavic, M., Fonteneau, R., Ernst, D.: Reinforcement learning for electric power system decision and control: Past considerations and perspectives. IFAC-PapersOnLine 50(1), 6918–6927 (2017)CrossRef Glavic, M., Fonteneau, R., Ernst, D.: Reinforcement learning for electric power system decision and control: Past considerations and perspectives. IFAC-PapersOnLine 50(1), 6918–6927 (2017)CrossRef
31.
go back to reference Graves, A., Mohamed, A.r., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645–6649. IEEE, Piscataway (2013) Graves, A., Mohamed, A.r., Hinton, G.: Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6645–6649. IEEE, Piscataway (2013)
32.
go back to reference Grollman, D.H., Jenkins, O.C.: Incremental learning of subtasks from unsegmented demonstration. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 261–266. IEEE, Piscataway (2010) Grollman, D.H., Jenkins, O.C.: Incremental learning of subtasks from unsegmented demonstration. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 261–266. IEEE, Piscataway (2010)
33.
go back to reference Grounds, M., Kudenko, D.: Parallel reinforcement learning with linear function approximation. In: Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, pp. 60–74. Springer, Berlin (2008) Grounds, M., Kudenko, D.: Parallel reinforcement learning with linear function approximation. In: Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, pp. 60–74. Springer, Berlin (2008)
34.
go back to reference Guenter, F., Hersch, M., Calinon, S., Billard, A.: Reinforcement learning for imitating constrained reaching movements. Adv. Rob. 21(13), 1521–1544 (2007)CrossRef Guenter, F., Hersch, M., Calinon, S., Billard, A.: Reinforcement learning for imitating constrained reaching movements. Adv. Rob. 21(13), 1521–1544 (2007)CrossRef
35.
go back to reference Guillet, A., Lenain, R., Thuilot, B., Rousseau, V.: Formation control of agricultural mobile robots: A bidirectional weighted constraints approach. J. Field Rob. 34, 1260–1274 (2017)CrossRef Guillet, A., Lenain, R., Thuilot, B., Rousseau, V.: Formation control of agricultural mobile robots: A bidirectional weighted constraints approach. J. Field Rob. 34, 1260–1274 (2017)CrossRef
36.
go back to reference Guo, X., Denman, S., Fookes, C., Mejias, L., Sridharan, S.: Automatic UAV forced landing site detection using machine learning. In: 2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1–7. IEEE, Piscataway (2014) Guo, X., Denman, S., Fookes, C., Mejias, L., Sridharan, S.: Automatic UAV forced landing site detection using machine learning. In: 2014 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1–7. IEEE, Piscataway (2014)
37.
go back to reference Gupta, J.K., Egorov, M., Kochenderfer, M.: Cooperative multi-agent control using deep reinforcement learning. In: International Conference on Autonomous Agents and Multiagent Systems, pp. 66–83. Springer, Berlin (2017) Gupta, J.K., Egorov, M., Kochenderfer, M.: Cooperative multi-agent control using deep reinforcement learning. In: International Conference on Autonomous Agents and Multiagent Systems, pp. 66–83. Springer, Berlin (2017)
38.
go back to reference Howitt, S., Richards, D.: The human machine interface for airborne control of UAVS. In: 2nd AIAA “Unmanned Unlimited” Conference and Workshop & Exhibit, p. 6593 (2003) Howitt, S., Richards, D.: The human machine interface for airborne control of UAVS. In: 2nd AIAA “Unmanned Unlimited” Conference and Workshop & Exhibit, p. 6593 (2003)
40.
go back to reference Hudjakov, R., Tamre, M.: Aerial imagery terrain classification for long-range autonomous navigation. In: 2009 International Symposium on Optomechatronic Technologies, pp. 88–91. IEEE, Piscataway (2009) Hudjakov, R., Tamre, M.: Aerial imagery terrain classification for long-range autonomous navigation. In: 2009 International Symposium on Optomechatronic Technologies, pp. 88–91. IEEE, Piscataway (2009)
41.
go back to reference Hunjet, R., Stevens, T., Elliot, M., Fraser, B., George, P.: Survivable communications and autonomous delivery service a generic swarming framework enabling communications in contested environments. In: MILCOM 2017–2017 IEEE Military Communications Conference (MILCOM), pp. 788–793. IEEE, Piscataway (2017) Hunjet, R., Stevens, T., Elliot, M., Fraser, B., George, P.: Survivable communications and autonomous delivery service a generic swarming framework enabling communications in contested environments. In: MILCOM 2017–2017 IEEE Military Communications Conference (MILCOM), pp. 788–793. IEEE, Piscataway (2017)
42.
go back to reference Hussein, A., Gaber, M.M., Elyan, E., Jayne, C.: Imitation learning: A survey of learning methods. ACM Comput. Surv. (CSUR) 50(2), 21 (2017) Hussein, A., Gaber, M.M., Elyan, E., Jayne, C.: Imitation learning: A survey of learning methods. ACM Comput. Surv. (CSUR) 50(2), 21 (2017)
43.
go back to reference Hwang, Y.K., Choi, K.J., Hong, D.S.: Self-learning control of cooperative motion for a humanoid robot. In: Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006, pp. 475–480. IEEE, Piscataway (2006) Hwang, Y.K., Choi, K.J., Hong, D.S.: Self-learning control of cooperative motion for a humanoid robot. In: Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006, pp. 475–480. IEEE, Piscataway (2006)
44.
go back to reference Iima, H., Kuroe, Y.: Swarm reinforcement learning method for a multi-robot formation problem. In: 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2298–2303. IEEE, Piscataway (2013) Iima, H., Kuroe, Y.: Swarm reinforcement learning method for a multi-robot formation problem. In: 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 2298–2303. IEEE, Piscataway (2013)
45.
go back to reference Jansen, B., Belpaeme, T.: A computational model of intention reading in imitation. Rob. Auton. Syst. 54(5), 394–402 (2006)CrossRef Jansen, B., Belpaeme, T.: A computational model of intention reading in imitation. Rob. Auton. Syst. 54(5), 394–402 (2006)CrossRef
46.
go back to reference Justesen, N., Risi, S.: Learning macromanagement in starcraft from replays using deep learning. In: 2017 IEEE Conference on Computational Intelligence and Games (CIG), pp. 162–169. IEEE, Piscataway (2017) Justesen, N., Risi, S.: Learning macromanagement in starcraft from replays using deep learning. In: 2017 IEEE Conference on Computational Intelligence and Games (CIG), pp. 162–169. IEEE, Piscataway (2017)
47.
go back to reference Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Lien, J.M.: A dddams-based UAV and UGV team formation approach for surveillance and crowd control. In: Proceedings of the 2014 Winter Simulation Conference, pp. 2907–2918. IEEE Press, Piscataway (2014) Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Lien, J.M.: A dddams-based UAV and UGV team formation approach for surveillance and crowd control. In: Proceedings of the 2014 Winter Simulation Conference, pp. 2907–2918. IEEE Press, Piscataway (2014)
48.
go back to reference Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Mousavian, A., Lien, J.M.: A comparative study of control architectures in UAV/UGV-based surveillance system. In: IIE Annual Conference. Proceedings. Institute of Industrial and Systems Engineers (IISE), p. 3455 (2014) Khaleghi, A.M., Xu, D., Minaeian, S., Li, M., Yuan, Y., Liu, J., Son, Y.J., Vo, C., Mousavian, A., Lien, J.M.: A comparative study of control architectures in UAV/UGV-based surveillance system. In: IIE Annual Conference. Proceedings. Institute of Industrial and Systems Engineers (IISE), p. 3455 (2014)
49.
go back to reference Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014). Preprint arXiv:1412.6980 Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014). Preprint arXiv:1412.6980
50.
go back to reference Kober, J., Peters, J.R.: Policy search for motor primitives in robotics. In: Advances in Neural Information Processing Systems, pp. 849–856 (2009) Kober, J., Peters, J.R.: Policy search for motor primitives in robotics. In: Advances in Neural Information Processing Systems, pp. 849–856 (2009)
51.
go back to reference Koenig, N., Howard, A.: Gazebo-3d multiple robot simulator with dynamics (2006) Koenig, N., Howard, A.: Gazebo-3d multiple robot simulator with dynamics (2006)
52.
go back to reference Kolling, A., Walker, P., Chakraborty, N., Sycara, K., Lewis, M.: Human interaction with robot swarms: A survey. IEEE Trans. Human-Mach. Syst. 46(1), 9–26 (2015)CrossRef Kolling, A., Walker, P., Chakraborty, N., Sycara, K., Lewis, M.: Human interaction with robot swarms: A survey. IEEE Trans. Human-Mach. Syst. 46(1), 9–26 (2015)CrossRef
53.
go back to reference Konidaris, G., Osentoski, S., Thomas, P.S.: Value function approximation in reinforcement learning using the fourier basis. In: Association for the Advancement of Artificial Intelligence, vol. 6, p. 7 (2011) Konidaris, G., Osentoski, S., Thomas, P.S.: Value function approximation in reinforcement learning using the fourier basis. In: Association for the Advancement of Artificial Intelligence, vol. 6, p. 7 (2011)
54.
go back to reference Kormushev, P., Calinon, S., Caldwell, D.G.: Robot motor skill coordination with em-based reinforcement learning. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3232–3237. IEEE, Piscataway (2010) Kormushev, P., Calinon, S., Caldwell, D.G.: Robot motor skill coordination with em-based reinforcement learning. In: 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3232–3237. IEEE, Piscataway (2010)
55.
go back to reference Kormushev, P., Calinon, S., Saegusa, R., Metta, G.: Learning the skill of archery by a humanoid robot ICub. In: 2010 10th IEEE-RAS International Conference on Humanoid Robots, pp. 417–423. IEEE, Piscataway (2010) Kormushev, P., Calinon, S., Saegusa, R., Metta, G.: Learning the skill of archery by a humanoid robot ICub. In: 2010 10th IEEE-RAS International Conference on Humanoid Robots, pp. 417–423. IEEE, Piscataway (2010)
56.
go back to reference Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012) Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
57.
go back to reference Kulić, D., Ott, C., Lee, D., Ishikawa, J., Nakamura, Y.: Incremental learning of full body motion primitives and their sequencing through human motion observation. Int. J. Rob. Res. 31(3), 330–345 (2012)CrossRef Kulić, D., Ott, C., Lee, D., Ishikawa, J., Nakamura, Y.: Incremental learning of full body motion primitives and their sequencing through human motion observation. Int. J. Rob. Res. 31(3), 330–345 (2012)CrossRef
58.
go back to reference LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)CrossRef LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436 (2015)CrossRef
59.
go back to reference Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies. J. Mach. Learn. Res. 17(1), 1334–1373 (2016)MathSciNetMATH Levine, S., Finn, C., Darrell, T., Abbeel, P.: End-to-end training of deep visuomotor policies. J. Mach. Learn. Res. 17(1), 1334–1373 (2016)MathSciNetMATH
60.
go back to reference Li, J., Monroe, W., Ritter, A., Galley, M., Gao, J., Jurafsky, D.: Deep reinforcement learning for dialogue generation (2016). Preprint arXiv:1606.01541 Li, J., Monroe, W., Ritter, A., Galley, M., Gao, J., Jurafsky, D.: Deep reinforcement learning for dialogue generation (2016). Preprint arXiv:1606.01541
61.
go back to reference Li, X., Chen, Y.N., Li, L., Gao, J., Celikyilmaz, A.: End-to-end task-completion neural dialogue systems (2017). Preprint arXiv:1703.01008 Li, X., Chen, Y.N., Li, L., Gao, J., Celikyilmaz, A.: End-to-end task-completion neural dialogue systems (2017). Preprint arXiv:1703.01008
62.
go back to reference Lin, S., Garratt, M.A., Lambert, A.J.: Monocular vision-based real-time target recognition and tracking for autonomously landing an uav in a cluttered shipboard environment. Autonom. Rob. 41(4), 881–901 (2017)CrossRef Lin, S., Garratt, M.A., Lambert, A.J.: Monocular vision-based real-time target recognition and tracking for autonomously landing an uav in a cluttered shipboard environment. Autonom. Rob. 41(4), 881–901 (2017)CrossRef
63.
go back to reference Liu, M., Amato, C., Anesta, E.P., Griffith, J.D., How, J.P.: Learning for decentralized control of multiagent systems in large, partially-observable stochastic environments. In: Thirtieth AAAI Conference on Artificial Intelligence (2016) Liu, M., Amato, C., Anesta, E.P., Griffith, J.D., How, J.P.: Learning for decentralized control of multiagent systems in large, partially-observable stochastic environments. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
64.
go back to reference Long, N.K., Sammut, K., Sgarioto, D., Garratt, M., Abbass, H.A.: A comprehensive review of shepherding as a bio-inspired swarm-robotics guidance approach. IEEE Trans. Emer. Topics Comput. Intell. 4, 523–537 (2020)CrossRef Long, N.K., Sammut, K., Sgarioto, D., Garratt, M., Abbass, H.A.: A comprehensive review of shepherding as a bio-inspired swarm-robotics guidance approach. IEEE Trans. Emer. Topics Comput. Intell. 4, 523–537 (2020)CrossRef
65.
go back to reference Mangin, O., Oudeyer, P.Y.: Unsupervised learning of simultaneous motor primitives through imitation. In: Frontiers in Computational Neuroscience Conference Abstract: IEEE ICDL-EPIROB 2011 (2011) Mangin, O., Oudeyer, P.Y.: Unsupervised learning of simultaneous motor primitives through imitation. In: Frontiers in Computational Neuroscience Conference Abstract: IEEE ICDL-EPIROB 2011 (2011)
66.
go back to reference Martinez, S., Cortes, J., Bullo, F.: Motion coordination with distributed information. IEEE Control Syst. Mag. 27(4), 75–88 (2007)CrossRef Martinez, S., Cortes, J., Bullo, F.: Motion coordination with distributed information. IEEE Control Syst. Mag. 27(4), 75–88 (2007)CrossRef
67.
go back to reference Mirowski, P., Pascanu, R., Viola, F., Soyer, H., Ballard, A.J., Banino, A., Denil, M., Goroshin, R., Sifre, L., Kavukcuoglu, K., et al.: Learning to navigate in complex environments (2016). Preprint arXiv:1611.03673 Mirowski, P., Pascanu, R., Viola, F., Soyer, H., Ballard, A.J., Banino, A., Denil, M., Goroshin, R., Sifre, L., Kavukcuoglu, K., et al.: Learning to navigate in complex environments (2016). Preprint arXiv:1611.03673
68.
go back to reference Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)CrossRef Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)CrossRef
69.
go back to reference Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., Kavukcuoglu, K.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937 (2016) Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., Kavukcuoglu, K.: Asynchronous methods for deep reinforcement learning. In: International Conference on Machine Learning, pp. 1928–1937 (2016)
70.
go back to reference Mülling, K., Kober, J., Kroemer, O., Peters, J.: Learning to select and generalize striking movements in robot table tennis. Int. J. Rob. Res. 32(3), 263–279 (2013)CrossRef Mülling, K., Kober, J., Kroemer, O., Peters, J.: Learning to select and generalize striking movements in robot table tennis. Int. J. Rob. Res. 32(3), 263–279 (2013)CrossRef
71.
go back to reference Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Supervised deep actor network for imitation learning in a ground-air UAV-UGVs coordination task. In: 2017 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2017) Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Supervised deep actor network for imitation learning in a ground-air UAV-UGVs coordination task. In: 2017 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE, Piscataway (2017)
72.
go back to reference Nguyen, H., Garratt, M., Abbass, H.: Apprenticeship bootstrapping. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018) Nguyen, H., Garratt, M., Abbass, H.: Apprenticeship bootstrapping. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018)
73.
go back to reference Nguyen, H., Tran, V., Nguyen, T., Garratt, M., Kasmarik, K., Barlow, M., Anavatti, S., Abbass, H.: Apprenticeship bootstrapping via deep learning with a safety net for UAV-UGV interaction (2018). Preprint arXiv:1810.04344 Nguyen, H., Tran, V., Nguyen, T., Garratt, M., Kasmarik, K., Barlow, M., Anavatti, S., Abbass, H.: Apprenticeship bootstrapping via deep learning with a safety net for UAV-UGV interaction (2018). Preprint arXiv:1810.04344
74.
go back to reference Nguyen, T., Nguyen, H., Debie, E., Kasmarik, K., Garratt, M., Abbass, H.: Swarm Q-Leaming with knowledge sharing within environments for formation control. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018) Nguyen, T., Nguyen, H., Debie, E., Kasmarik, K., Garratt, M., Abbass, H.: Swarm Q-Leaming with knowledge sharing within environments for formation control. In: 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE, Piscataway (2018)
75.
go back to reference Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Apprenticeship learning for continuous state spaces and actions in a swarm-guidance shepherding task. In: 2019 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 102–109. IEEE, Piscataway (2019) Nguyen, H.T., Garratt, M., Bui, L.T., Abbass, H.: Apprenticeship learning for continuous state spaces and actions in a swarm-guidance shepherding task. In: 2019 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 102–109. IEEE, Piscataway (2019)
76.
go back to reference Nguyen, H.T., Nguyen, T.D., Garratt, M., Kasmarik, K., Anavatti, S., Barlow, M., Abbass, H.A.: A deep hierarchical reinforcement learner for aerial shepherding of ground swarms. In: International Conference on Neural Information Processing, pp. 658–669. Springer, Berlin (2019) Nguyen, H.T., Nguyen, T.D., Garratt, M., Kasmarik, K., Anavatti, S., Barlow, M., Abbass, H.A.: A deep hierarchical reinforcement learner for aerial shepherding of ground swarms. In: International Conference on Neural Information Processing, pp. 658–669. Springer, Berlin (2019)
77. Niekum, S., Osentoski, S., Konidaris, G., Barto, A.G.: Learning and generalization of complex tasks from unstructured demonstrations. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5239–5246. IEEE, Piscataway (2012)
78.
79. Oh, H., Shirazi, A.R., Sun, C., Jin, Y.: Bio-inspired self-organising multi-robot pattern formation: a review. Rob. Auton. Syst. 91, 83–100 (2017)
80. Palmer, G., Tuyls, K., Bloembergen, D., Savani, R.: Lenient multi-agent deep reinforcement learning. In: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pp. 443–451. International Foundation for Autonomous Agents and Multiagent Systems (2018)
82. Pastor, P., Kalakrishnan, M., Chitta, S., Theodorou, E., Schaal, S.: Skill learning and task outcome prediction for manipulation. In: 2011 IEEE International Conference on Robotics and Automation, pp. 3828–3834. IEEE, Piscataway (2011)
83. Pendleton, S.D., Andersen, H., Du, X., Shen, X., Meghjani, M., Eng, Y.H., Rus, D., Ang, M.H.: Perception, planning, control, and coordination for autonomous vehicles. Machines 5(1), 6 (2017)
84. Ramachandran, D., Amir, E.: Bayesian inverse reinforcement learning. In: International Joint Conferences on Artificial Intelligence, vol. 7, pp. 2586–2591 (2007)
85. Ross, S., Melik-Barkhudarov, N., Shankar, K.S., Wendel, A., Dey, D., Bagnell, J.A., Hebert, M.: Learning monocular reactive UAV control in cluttered natural environments. In: 2013 IEEE International Conference on Robotics and Automation, pp. 1765–1772. IEEE, Piscataway (2013)
86. Sen, A., Sahoo, S.R., Kothari, M.: Cooperative formation control strategy in heterogeneous network with bounded acceleration. In: 2017 Indian Control Conference (ICC), pp. 344–349. IEEE, Piscataway (2017)
87. Skoglund, A., Iliev, B., Kadmiry, B., Palm, R.: Programming by demonstration of pick-and-place tasks for industrial manipulators using task primitives. In: 2007 International Symposium on Computational Intelligence in Robotics and Automation, pp. 368–373. IEEE, Piscataway (2007)
88. Song, J., Ren, H., Sadigh, D., Ermon, S.: Multi-agent generative adversarial imitation learning. In: Advances in Neural Information Processing Systems, pp. 7461–7472 (2018)
89. Speck, C., Bucci, D.J.: Distributed UAV swarm formation control via object-focused, multi-objective SARSA. In: 2018 Annual American Control Conference (ACC), pp. 6596–6601. IEEE, Piscataway (2018)
90. Strömbom, D., Mann, R.P., Wilson, A.M., Hailes, S., Morton, A.J., Sumpter, D.J.T., King, A.J.: Solving the shepherding problem: heuristics for herding autonomous, interacting agents. J. R. Soc. Interface 11(100) (2014). https://browzine.com/articles/52614503
91. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, vol. 1. MIT Press, Cambridge (1998)
92. Szepesvári, C.: Algorithms for reinforcement learning. Synth. Lect. Artif. Intell. Mach. Learn. 4(1), 1–103 (2010)
93. Trentini, M., Beckman, B.: Semi-autonomous UAV/UGV for dismounted urban operations. In: Unmanned Systems Technology XII, vol. 7692, p. 76921C. International Society for Optics and Photonics (2010)
94. Tsitsiklis, J.N., Van Roy, B.: Analysis of temporal-difference learning with function approximation. In: Advances in Neural Information Processing Systems, pp. 1075–1081 (1997)
95. Vidal, R., Rashid, S., Sharp, C., Shakernia, O., Kim, J., Sastry, S.: Pursuit-evasion games with unmanned ground and aerial vehicles. In: Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No. 01CH37164), vol. 3, pp. 2948–2955. IEEE, Piscataway (2001)
96. Waslander, S.L.: Unmanned aerial and ground vehicle teams: recent work and open problems. In: Autonomous Control Systems and Vehicles, pp. 21–36. Springer, Berlin (2013)
97. Wulfmeier, M., Ondruska, P., Posner, I.: Maximum entropy deep inverse reinforcement learning (2015). Preprint arXiv:1507.04888
98. Xu, D., Zhang, X., Zhu, Z., Chen, C., Yang, P.: Behavior-based formation control of swarm robots. Math. Problems Eng. 2014, 205759 (2014)
99. Yang, Z., Merrick, K., Jin, L., Abbass, H.A.: Hierarchical deep reinforcement learning for continuous action control. IEEE Trans. Neural Netw. Learn. Syst. (99), 1–11 (2018)
100. Yoshikai, T., Otake, N., Mizuuchi, I., Inaba, M., Inoue, H.: Development of an imitation behavior in humanoid Kenta with reinforcement learning algorithm based on the attention during imitation. In: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), vol. 2, pp. 1192–1197. IEEE, Piscataway (2004)
101. You, C., Lu, J., Filev, D., Tsiotras, P.: Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning. Rob. Auton. Syst. 114, 1–18 (2019)
102. Yu, H., Beard, R.W., Argyle, M., Chamberlain, C.: Probabilistic path planning for cooperative target tracking using aerial and ground vehicles. In: Proceedings of the 2011 American Control Conference, pp. 4673–4678. IEEE, Piscataway (2011)
103. Zhang, T., Kahn, G., Levine, S., Abbeel, P.: Learning deep control policies for autonomous aerial vehicles with MPC-guided policy search. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 528–535. IEEE, Piscataway (2016)
104. Zhang, T., Li, Q., Zhang, C.S., Liang, H.W., Li, P., Wang, T.M., Li, S., Zhu, Y.L., Wu, C.: Current trends in the development of intelligent unmanned autonomous systems. Front. Inf. Technol. Electron. Eng. 18(1), 68–85 (2017)
105. Zhan, E., Zheng, S., Yue, Y., Sha, L., Lucey, P.: Generative multi-agent behavioral cloning. In: Proceedings of the 35th International Conference on Machine Learning (2018)
106. Ziebart, B.D., Maas, A.L., Bagnell, J.A., Dey, A.K.: Maximum entropy inverse reinforcement learning. In: Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, vol. 8, pp. 1433–1438. Chicago (2008)
Metadata
Title
Apprenticeship Bootstrapping Reinforcement Learning for Sky Shepherding of a Ground Swarm in Gazebo
Authors
Hung Nguyen
Matthew Garratt
Hussein A. Abbass
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-60898-9_10
