A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning (2311.15920v1)

Published 27 Nov 2023 in cs.AI

Abstract: The optimization of traffic signal control (TSC) is critical for an efficient transportation system. In recent years, reinforcement learning (RL) techniques have emerged as a popular approach for TSC and show promising results for highly adaptive control. However, existing RL-based methods suffer from poor real-world applicability and have seen hardly any successful deployments. These failures stem mostly from the reliance on over-idealized traffic simulators for policy optimization, as well as from the use of unrealistically fine-grained state observations and reward signals that are not directly obtainable from real-world sensors. In this paper, we propose a fully Data-Driven and simulator-free framework for realistic Traffic Signal Control (D2TSC). Specifically, we combine well-established traffic flow theory with machine learning to construct a reward inference model that infers reward signals from coarse-grained traffic data. With the inferred rewards, we further propose a sample-efficient offline RL method that enables direct signal control policy learning from historical offline datasets of real-world intersections. To evaluate our approach, we collect historical traffic data from a real-world intersection and develop a highly customized simulation environment that strictly follows real data characteristics. Extensive experiments demonstrate that our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability.
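
To make the two-stage pipeline in the abstract concrete, below is a minimal, hypothetical sketch in PyTorch: a reward-inference model is first fit to coarse-grained traffic features, and the inferred rewards are then used to relabel an offline log before a simple behavior-regularized Q-learning update. All names (RewardModel, mlp), dimensions, the proxy-reward target, and the regularization scheme are illustrative assumptions, not the paper's actual method; D2TSC builds its reward model on traffic flow theory and uses its own offline RL algorithm.

# A minimal, hypothetical sketch of the two-stage pipeline described above:
# (1) fit a reward-inference model on coarse-grained traffic features,
# (2) relabel an offline log with inferred rewards and run a simple
# behavior-regularized Q-learning update. All names, dimensions, the proxy
# reward target, and the regularizer are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(in_dim, out_dim, hidden=64):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

class RewardModel(nn.Module):
    # Maps coarse-grained observations (e.g., per-approach vehicle counts)
    # to a scalar reward estimate.
    def __init__(self, obs_dim):
        super().__init__()
        self.net = mlp(obs_dim, 1)

    def forward(self, obs):
        return self.net(obs).squeeze(-1)

# Stage 1: fit the reward model on (observation, proxy-reward) pairs.
obs_dim, act_dim, n = 8, 4, 256
obs = torch.randn(n, obs_dim)                      # coarse sensor features
proxy_reward = -obs[:, :4].clamp(min=0).sum(-1)    # e.g., a negative-queue proxy

reward_model = RewardModel(obs_dim)
opt_r = torch.optim.Adam(reward_model.parameters(), lr=1e-3)
for _ in range(200):
    loss = ((reward_model(obs) - proxy_reward) ** 2).mean()
    opt_r.zero_grad(); loss.backward(); opt_r.step()

# Stage 2: offline RL on the relabeled historical log.
actions = torch.randint(0, act_dim, (n,))          # logged signal-phase choices
next_obs = torch.randn(n, obs_dim)
with torch.no_grad():
    rewards = reward_model(obs)                    # inferred, never observed

q_net = mlp(obs_dim, act_dim)
opt_q = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma, bc_weight = 0.99, 0.1
for _ in range(200):
    q = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * q_net(next_obs).max(-1).values
    td_loss = ((q - target) ** 2).mean()
    # Crude behavior regularization: keep the greedy policy close to the
    # logged actions to avoid out-of-distribution action values.
    bc_loss = F.cross_entropy(q_net(obs), actions)
    loss = td_loss + bc_weight * bc_loss
    opt_q.zero_grad(); loss.backward(); opt_q.step()

The point this sketch mirrors from the abstract is that rewards are never read from fine-grained sensors: they are inferred from coarse-grained data, so policy learning can proceed entirely from historical logs without a simulator.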

