
DHL: Deep reinforcement learning-based approach for emergency supply distribution in humanitarian logistics

Authors: Junchao Fan, Xiaolin Chang, Jelena Mišić, Vojislav B. Mišić, Hongyue Kang

Published in: Peer-to-Peer Networking and Applications | Issue 5/2022

Abstract

Alleviating human suffering in disasters is one of the main objectives of humanitarian logistics. The lack of emergency rescue materials is the root cause of this suffering and must be considered when making emergency supply distribution decisions. Large-scale disasters often cause varying degrees of damage to different affected areas, leading to differences in both human suffering and the demand for emergency supplies across those areas. This paper considers a novel emergency supply distribution scenario in humanitarian logistics that takes these differences into account. In this scenario, besides economic goals such as minimizing costs, the humanitarian goal of alleviating the suffering of survivors is treated as one of the main bases for emergency supply distribution decision making. We first formulate the emergency supply distribution problem as a Markov decision process. Then, to obtain an optimal resource allocation policy that reduces economic cost while decreasing the suffering of survivors, a Deep Q-Network-based approach for emergency supply distribution in Humanitarian Logistics (DHL) is developed. Numerical results demonstrate that DHL outperforms other baselines and solves the problem with lower time complexity.
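The abstract names the two technical ingredients of DHL: a Markov decision process formulation of supply distribution and a Deep Q-Network that learns the allocation policy. The paper's concrete state, action, and reward definitions are not reproduced on this page, so the following is only a minimal sketch of the generic DQN training loop such an approach builds on. The environment dimensions, the network architecture, and the cost-versus-suffering reward described in the comments are illustrative assumptions, not the authors' formulation.

```python
# Minimal DQN training-loop sketch (PyTorch) of the kind DHL builds on.
# Everything problem-specific is a hypothetical stand-in: the state
# (remaining demand and a suffering level per area), the action (choose
# the area that receives the next supply batch), and the reward (negative
# transport cost minus a suffering penalty) are NOT the paper's actual model.
import random
from collections import deque

import torch
import torch.nn as nn

N_AREAS = 4               # hypothetical number of affected areas
STATE_DIM = 2 * N_AREAS   # e.g. demand + suffering level per area


class QNet(nn.Module):
    """Maps a state vector to one Q-value per candidate area."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_AREAS),
        )

    def forward(self, x):
        return self.net(x)


q_net, target_net = QNet(), QNet()
target_net.load_state_dict(q_net.state_dict())  # frozen copy stabilizes targets
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)                   # experience replay buffer
GAMMA, EPSILON = 0.99, 0.1


def select_action(state):
    """Epsilon-greedy: explore occasionally, otherwise follow Q-values."""
    if random.random() < EPSILON:
        return random.randrange(N_AREAS)
    with torch.no_grad():
        return int(q_net(state).argmax())


def train_step(batch_size=32):
    """One gradient step on the standard DQN TD target:
    r + gamma * max_a' Q_target(s', a') for non-terminal s'."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s = torch.stack([t[0] for t in batch])
    a = torch.tensor([t[1] for t in batch])
    r = torch.tensor([t[2] for t in batch], dtype=torch.float32)
    s2 = torch.stack([t[3] for t in batch])
    done = torch.tensor([t[4] for t in batch], dtype=torch.float32)

    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * (1.0 - done) * target_net(s2).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a full implementation, transitions (state, action, reward, next_state, done) would be generated by simulating the distribution environment and appended to `replay`, with `target_net` periodically re-synced from `q_net`.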

Metadata

Title: DHL: Deep reinforcement learning-based approach for emergency supply distribution in humanitarian logistics
Authors: Junchao Fan, Xiaolin Chang, Jelena Mišić, Vojislav B. Mišić, Hongyue Kang
Publication date: 20-07-2022
Publisher: Springer US
Published in: Peer-to-Peer Networking and Applications / Issue 5/2022
Print ISSN: 1936-6442
Electronic ISSN: 1936-6450
DOI: https://doi.org/10.1007/s12083-022-01353-0
