
A Reinforcement Learning Approach for Dynamic Rebalancing in Bike-Sharing System (2402.03589v1)

Published 5 Feb 2024 in cs.LG and math.OC

Abstract: Bike-Sharing Systems provide eco-friendly urban mobility, contributing to the alleviation of traffic congestion and to healthier lifestyles. Efficiently operating such systems and maintaining high customer satisfaction is challenging due to the stochastic nature of trip demand, leading to full or empty stations. Devising effective rebalancing strategies using vehicles to redistribute bikes among stations is therefore of utmost importance for operators. As a promising alternative to classical mathematical optimization, reinforcement learning is gaining ground for solving sequential decision-making problems. This paper introduces a spatio-temporal reinforcement learning algorithm for the dynamic rebalancing problem with multiple vehicles. We first formulate the problem as a Multi-agent Markov Decision Process in a continuous time framework. This allows for independent and cooperative vehicle rebalancing, eliminating the impractical restriction of time-discretized models where vehicle departures are synchronized. A comprehensive simulator under the first-arrive-first-serve rule is then developed to facilitate the learning process by computing immediate rewards under diverse demand scenarios. To estimate the value function and learn the rebalancing policy, various Deep Q-Network configurations are tested, with the objective of minimizing lost demand. Experiments are carried out on various datasets generated from historical data, affected by both temporal and weather factors. The proposed algorithms outperform benchmarks, including a multi-period Mixed-Integer Programming model, in terms of lost demand. Once trained, the policy yields immediate decisions, making it suitable for real-time applications. Our work offers practical insights for operators and enriches the integration of reinforcement learning into dynamic rebalancing problems, paving the way for more intelligent and robust urban mobility solutions.
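The abstract's core loop — a simulator that processes trips first-arrive-first-serve, computes a lost-demand reward, and trains a Q-learning rebalancing policy — can be sketched in miniature. Everything below is an illustrative assumption, not the authors' implementation: two stations instead of a city, one vehicle, discrete periods instead of continuous time, and tabular Q-values instead of the paper's Deep Q-Network.

```python
import random

CAPACITY = 10  # assumed per-station dock capacity for this toy example


def simulate_period(inventory, trips):
    """Process trip requests first-arrive-first-serve.

    A rental fails if the origin station is empty; a return fails if the
    destination is full. Both count toward lost demand in this sketch.
    Returns (new inventory, lost demand).
    """
    inv = list(inventory)
    lost = 0
    for origin, dest in trips:
        if inv[origin] == 0 or inv[dest] >= CAPACITY:
            lost += 1
        else:
            inv[origin] -= 1
            inv[dest] += 1
    return inv, lost


def q_learning(episodes=500, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning for one vehicle that sets a target fill level
    for station 0; the reward is the negative lost demand of the period."""
    rng = random.Random(seed)
    actions = range(CAPACITY + 1)  # candidate target inventories for station 0
    q = {}                         # state (inventory tuple) -> action values
    for _ in range(episodes):
        inv = [rng.randint(0, CAPACITY), rng.randint(0, CAPACITY)]
        for _ in range(5):         # short rollout horizon
            s = tuple(inv)
            qs = q.setdefault(s, [0.0] * (CAPACITY + 1))
            if rng.random() < eps:             # epsilon-greedy exploration
                a = rng.choice(list(actions))
            else:
                a = max(actions, key=qs.__getitem__)
            inv[0] = a                          # vehicle rebalances station 0
            trips = ([(0, 1)] * rng.randint(0, 4)
                     + [(1, 0)] * rng.randint(0, 4))
            rng.shuffle(trips)
            inv, lost = simulate_period(inv, trips)
            # One-step Q-update toward reward plus discounted next-state value.
            target = -lost + gamma * max(q.get(tuple(inv), [0.0]))
            qs[a] += alpha * (target - qs[a])
    return q
```

In the paper, the tabular lookup is replaced by a Deep Q-Network over a spatio-temporal state, the simulator covers many stations and multiple asynchronous vehicles, and demand scenarios are drawn from historical data; this sketch only mirrors the reward structure (negative lost demand) and the simulate-then-update training loop.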
