Reward function corresponding to human driving behavior
Determine a reward function that corresponds to human driving behavior for use in multi-agent reinforcement learning for autonomous driving. The goal is to clarify the target objective so that it accurately captures how people drive in interactive traffic scenarios.
References
However, it is not entirely clear what reward function corresponds to human driving and the inclusion of this type of reward shaping can create undesired behaviors (Knox et al., 2023).
— Human-compatible driving partners through data-regularized self-play reinforcement learning
(2403.19648 - Cornelisse et al., 28 Mar 2024) in Section 4 Related work