Principled reward design for reinforcement learning in autonomous driving

Develop a principled methodology for designing reward functions for reinforcement learning-based autonomous driving, so that the rewards guide learning effectively in dynamic traffic, avoid the suboptimal behavior inherited from rule-based heuristics, and align with evaluation metrics.

Background

The paper surveys prior reinforcement learning approaches for driving and highlights that many works rely on complex, shaped rewards built from multiple rule-based terms, which can introduce local minima and cap performance. In contrast, the authors propose a simple reward centered on route completion, while noting that the broader question of how to design effective rewards for driving remains unresolved.

They explicitly characterize reward design as an open problem in the literature, motivating the need for frameworks that balance learnability, scalability, and alignment with task metrics without embedding limitations from rule-based components.
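
The contrast above between a shaped reward and a route-completion reward can be made concrete with a minimal sketch. It is not the reward used in the paper: the `StepInfo` fields, the three rule-based terms, and all weights are hypothetical, chosen only to show the structural difference, and the handling of infractions is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step simulator signals (all field names are assumptions)."""
    route_completion: float       # fraction of the route completed so far, in [0, 1]
    prev_route_completion: float  # the same quantity at the previous step
    speed: float                  # current speed in m/s
    target_speed: float           # speed prescribed by a hand-written rule, in m/s
    lateral_offset: float         # distance from the lane centre in metres
    collided: bool                # whether a collision occurred at this step


def shaped_reward(info: StepInfo,
                  w_speed: float = 1.0,
                  w_lane: float = 0.5,
                  w_collision: float = 10.0) -> float:
    """Weighted sum of rule-based terms (illustrative terms and weights)."""
    # Penalise deviation from the rule-based target speed, normalised by that speed.
    speed_term = -abs(info.speed - info.target_speed) / max(info.target_speed, 1e-6)
    # Penalise distance from the lane centre.
    lane_term = -abs(info.lateral_offset)
    # Fixed penalty on the step where a collision happens.
    collision_term = -1.0 if info.collided else 0.0
    return w_speed * speed_term + w_lane * lane_term + w_collision * collision_term


def route_completion_reward(info: StepInfo) -> float:
    """Reward equal to the route progress made since the previous step."""
    return info.route_completion - info.prev_route_completion


if __name__ == "__main__":
    info = StepInfo(route_completion=0.30, prev_route_completion=0.25,
                    speed=8.0, target_speed=10.0,
                    lateral_offset=0.4, collided=False)
    print("shaped:", shaped_reward(info))
    print("route completion:", route_completion_reward(info))
```

In the shaped variant, every term encodes a hand-written rule (a target speed, a preferred lane position), so the policy is pushed towards whatever those rules prescribe. In the progress-based variant, the per-step rewards summed over an episode telescope to the final route completion, which is one way a reward can track an evaluation metric directly.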

References

Due to the flexibility in optimization, there are many ways to define the reward for driving, making reward design an important open problem.

— CaRL: Learning Scalable Planning Policies with Simple Rewards (2504.17838 - Jaeger et al., 24 Apr 2025) in Appendix, Section 'Related work', subsubsection 'Rewards for Driving'
