
2022 | OriginalPaper | Chapter

Hierarchically Structured Scheduling and Execution of Tasks in a Multi-agent Environment

Authors: Diogo Carvalho, Biswa Sengupta

Published in: Progress in Artificial Intelligence

Publisher: Springer International Publishing


Abstract

In a warehouse environment, tasks appear dynamically. Consequently, a task management system that matches them with the workforce too early (e.g., weeks in advance) is necessarily sub-optimal. Moreover, the rapidly growing action space of such a system poses a significant problem for traditional schedulers. Reinforcement learning, however, is well suited to problems that require sequential decisions towards a long-term, often remote, goal. In this work, we address a problem with a hierarchical structure: task scheduling by a centralised agent in a dynamic multi-agent warehouse environment, and execution of the resulting schedule by decentralised agents with only partial observability thereof. We propose to use deep reinforcement learning to solve both the high-level scheduling problem and the low-level multi-agent problem of schedule execution. The topic and contribution are relevant to both the reinforcement learning and operations research communities and are directed towards future real-world industrial applications.
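The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the grid warehouse, the names `Worker`, `schedule`, `act` and `run`, and the greedy and move-towards-target rules are all assumptions made for illustration, standing in for the learned high-level and low-level policies. The sketch only shows the division of information: the scheduler sees every worker and every pending task, while each worker acts on its own position and its own assigned task.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, col) on a hypothetical grid warehouse


@dataclass
class Worker:
    position: Cell
    task: Optional[Cell] = None  # target cell of the currently assigned task


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def schedule(workers: List[Worker], pending: List[Cell]) -> None:
    """High level (centralised): sees all workers and all pending tasks.

    Greedy nearest-task rule; a placeholder for a learned scheduling policy.
    """
    for w in workers:
        if w.task is None and pending:
            nearest = min(pending, key=lambda t: manhattan(w.position, t))
            pending.remove(nearest)
            w.task = nearest


def act(observation: Dict[str, Cell]) -> Cell:
    """Low level (decentralised): sees only its own position and task.

    Returns a unit move (d_row, d_col); a placeholder for a learned
    execution policy.
    """
    (r, c), (tr, tc) = observation["position"], observation["task"]
    if r != tr:
        return (1 if tr > r else -1, 0)
    if c != tc:
        return (0, 1 if tc > c else -1)
    return (0, 0)


def run(workers: List[Worker], arrivals: Dict[int, List[Cell]], horizon: int) -> int:
    """Simulate `horizon` steps; tasks arrive dynamically via `arrivals`."""
    pending: List[Cell] = []
    completed = 0
    for t in range(horizon):
        pending.extend(arrivals.get(t, []))  # tasks appear over time
        schedule(workers, pending)           # assign only when a worker is free
        for w in workers:
            if w.task is None:
                continue
            d_row, d_col = act({"position": w.position, "task": w.task})
            w.position = (w.position[0] + d_row, w.position[1] + d_col)
            if w.position == w.task:
                completed += 1
                w.task = None
    return completed
```

Because tasks are matched to workers only when they arrive and a worker is free, the sketch also reflects the abstract's point that committing to a schedule far in advance is sub-optimal when tasks appear dynamically.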


Metadata
Title
Hierarchically Structured Scheduling and Execution of Tasks in a Multi-agent Environment
Authors
Diogo Carvalho
Biswa Sengupta
Copyright Year
2022
DOI
https://doi.org/10.1007/978-3-031-16474-3_2
