2021 | Original Paper | Book Chapter

RP-DQN: An Application of Q-Learning to Vehicle Routing Problems

Authors: Ahmad Bdeir, Simon Boeder, Tim Dernedde, Kirill Tkachuk, Jonas K. Falkner, Lars Schmidt-Thieme

Published in: KI 2021: Advances in Artificial Intelligence

Publisher: Springer International Publishing


Abstract

In this paper we present a new approach to tackling complex routing problems with an improved state representation that utilizes the model complexity better than previous methods. We enable this by training from temporal differences; specifically, we employ Q-learning. We show that our approach achieves state-of-the-art performance among autoregressive policies that construct solutions by sequentially inserting nodes on the Capacitated Vehicle Routing Problem (CVRP). Additionally, we are the first to tackle the Multiple Depot Vehicle Routing Problem (MDVRP) with Reinforcement Learning (RL) and demonstrate that this problem type benefits greatly from our approach over other Machine Learning (ML) methods.
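The training signal the abstract describes (temporal-difference learning via Q-learning over sequential node insertions) can be sketched in minimal tabular form. This is not the paper's RP-DQN model, which uses a neural Q-network over a richer state representation; the dictionary-based Q-table, the state labels, and the node names below are purely illustrative assumptions:

```python
def q_learning_step(Q, state, action, reward, next_state, feasible_next,
                    alpha=0.1, gamma=1.0):
    """One temporal-difference (Q-learning) update for a construction step.

    The agent in `state` inserted node `action` into the tour (reward is the
    negative travel cost) and next chooses among `feasible_next`, the nodes
    still allowed by the capacity constraint.
    """
    # Greedy value over the next step's feasible nodes (0.0 if the tour ends).
    best_next = max((Q.get((next_state, a), 0.0) for a in feasible_next),
                    default=0.0)
    td_target = reward + gamma * best_next          # one-step TD target
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (td_target - old)
    return Q[(state, action)]

# Hypothetical table mapping (state, node) -> estimated remaining tour value.
Q = {}
# Leaving the depot for node "n1" costs 3 distance units (reward -3.0); with
# alpha=1.0 and an empty table, the new estimate equals the TD target itself.
q_learning_step(Q, "depot", "n1", -3.0, "at_n1", ["n2", "depot"], alpha=1.0)
```

In the tabular case the update converges under the usual Q-learning conditions; RP-DQN replaces the table lookup with a learned Q-network so that values generalize across problem instances.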


Metadata
Title
RP-DQN: An Application of Q-Learning to Vehicle Routing Problems
Authors
Ahmad Bdeir
Simon Boeder
Tim Dernedde
Kirill Tkachuk
Jonas K. Falkner
Lars Schmidt-Thieme
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-87626-5_1