
Rebalancing the car-sharing system with reinforcement learning

Authors: Changwei Ren, Lixingjian An, Zhanquan Gu, Yuexuan Wang, Yunjun Gao

Published in: World Wide Web | Issue 4/2020

20-04-2020


Abstract

With the boom of the sharing economy, the number of car-sharing corporations has increased notably, providing a greater variety of travel options and improved convenience and functionality. Owing to the similarity of travel patterns among the urban population, car-sharing systems often face an imbalance in the spatial distribution of shared cars, especially during rush hours. Redressing this imbalance poses many challenges, such as insufficient data and a large state space. In this study, we propose a new reward method called Double P (Picking & Parking) Bonus (DPB). We model the research problem as a Markov Decision Process (MDP) and introduce Deep Deterministic Policy Gradient (DDPG), a state-of-the-art reinforcement learning framework, to find a solution. The results show that the rewarding mechanism embodied in DPB can indeed guide users' behavior through price leverage, increase user stickiness, and cultivate user habits, thereby boosting the service provider's long-term profit. In addition, taking the battery level of the shared cars into consideration, we use hierarchical reinforcement learning for station scheduling. This scheduling method encourages users to park cars that need charging at the charging posts within a station, ensuring the effective use of charging-pile resources and thereby keeping the shared cars running efficiently.
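The abstract does not spell out the DPB reward formula, so the following is a minimal, hypothetical sketch of how a picking-and-parking bonus could be layered on top of the base trip price. All station names, thresholds, and coefficients (`pick_coef`, `park_coef`, `supply`, `target`) are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch of a Double P (Picking & Parking) Bonus reward.
# Coefficients and station inventories below are assumed for illustration;
# the paper's exact reward design is not reproduced here.

def dpb_reward(trip_price, pick_station, park_station, supply, target,
               pick_coef=0.5, park_coef=0.5):
    """Base revenue plus bonuses that steer users toward rebalancing.

    supply[s] -- current number of cars at station s
    target[s] -- desired number of cars at station s
    """
    reward = trip_price
    # Picking bonus: reward trips that start at an oversupplied station.
    if supply[pick_station] > target[pick_station]:
        reward += pick_coef * (supply[pick_station] - target[pick_station])
    # Parking bonus: reward trips that end at an undersupplied station.
    if supply[park_station] < target[park_station]:
        reward += park_coef * (target[park_station] - supply[park_station])
    return reward


# Example: a trip from an oversupplied station A to an undersupplied station B
# earns both bonuses, mimicking the price leverage described in the abstract.
supply = {"A": 12, "B": 3}
target = {"A": 8, "B": 7}
print(dpb_reward(10.0, "A", "B", supply, target))  # 10 + 0.5*4 + 0.5*4 = 14.0
```

In a full setup along the lines the abstract describes, a reward of this shape would be emitted at each MDP step and a DDPG agent would learn the bonus prices themselves as continuous actions; the in-station charging incentive could be folded in analogously as an extra bonus term for parking at a charging post.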


Metadata
Title
Rebalancing the car-sharing system with reinforcement learning
Authors
Changwei Ren
Lixingjian An
Zhanquan Gu
Yuexuan Wang
Yunjun Gao
Publication date
20-04-2020
Publisher
Springer US
Published in
World Wide Web / Issue 4/2020
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-020-00804-z
