Scalable uncertainty-aware RL reward design for truthfulness in LLMs
Develop scalable reward formulations for reinforcement learning that reliably capture the truthfulness of large language model outputs while balancing accuracy against uncertainty, so that the reward structure incentivizes truthful behavior consistently across tasks and model scales. A minimal illustrative sketch follows.
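One simple way to trade accuracy against uncertainty, in the spirit of the ternary reward discussed in TruthRL, is to reward correct answers, penalize hallucinated (incorrect) answers, and give explicit abstentions a neutral reward so the policy is not pushed to guess. The sketch below is an assumption-laden illustration, not the paper's implementation; the names `Rollout` and `truthfulness_reward` and the specific reward values are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical ternary reward for truthfulness-oriented RL fine-tuning.
# Correct answers are rewarded, hallucinated answers are penalized, and
# explicit abstentions receive a neutral reward, letting the policy trade
# accuracy against uncertainty instead of always guessing.

@dataclass
class Rollout:
    answer: str          # model output for one prompt
    is_correct: bool     # judged against a reference answer
    abstained: bool      # model explicitly declined to answer

def truthfulness_reward(r: Rollout,
                        correct: float = 1.0,
                        abstain: float = 0.0,
                        hallucinate: float = -1.0) -> float:
    """Map a single rollout to a scalar reward (values are illustrative)."""
    if r.abstained:
        return abstain
    return correct if r.is_correct else hallucinate

# Example: score a small batch of rollouts, e.g. before normalizing the
# rewards into advantages for a policy-gradient update.
batch = [
    Rollout("Paris", is_correct=True, abstained=False),
    Rollout("I don't know.", is_correct=False, abstained=True),
    Rollout("Lyon", is_correct=False, abstained=False),
]
print([truthfulness_reward(r) for r in batch])  # [1.0, 0.0, -1.0]
```

How the neutral abstention reward is set relative to the hallucination penalty controls how conservative the resulting policy becomes; scaling this signal reliably across tasks is exactly the open challenge the quoted passage below describes.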
References
Despite these advances, designing scalable reward signals that reliably capture truthfulness while balancing accuracy and uncertainty remains an open challenge.
— TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
(2509.25760 - Wei et al., 30 Sep 2025) in Section 6.2 Reinforcement Learning for LLMs (Related Work)