Process for creating compressed reward functions from example outcomes
Elucidate the process by which people create compressed reward functions from raw example outcomes. Specify the algorithmic and representational steps that yield simplified mappings from outcome features to scalar rewards, and clarify whether people employ rule-based, similarity-based, or mixed strategies.
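To make the three candidate strategies concrete, the sketch below contrasts them on a toy problem. It is illustrative only and is not the method of the cited paper: the example outcomes, the single-feature threshold rule, the exponential similarity kernel, and the names `fit_rule`, `fit_similarity`, `fit_mixed`, `sharpness`, and `rule_weight` are all assumptions introduced here.

```python
import math

# Hypothetical example outcomes: (feature vector, observed reward).
EXAMPLES = [
    ((0.9, 0.2), 1.0),
    ((0.8, 0.7), 1.0),
    ((0.2, 0.3), 0.0),
    ((0.1, 0.8), 0.0),
]


def fit_rule(examples):
    """Rule-based compression (assumed form): choose the single feature and
    threshold that best reproduce the example rewards."""
    best = None
    n_features = len(examples[0][0])
    for i in range(n_features):
        values = sorted(x[i] for x, _ in examples)
        # Candidate thresholds are midpoints between adjacent feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            errors = sum((1.0 if x[i] > t else 0.0) != r for x, r in examples)
            if best is None or errors < best[0]:
                best = (errors, i, t)
    _, feature, threshold = best
    return lambda x: 1.0 if x[feature] > threshold else 0.0


def fit_similarity(examples, sharpness=5.0):
    """Similarity-based compression (assumed form): reward is the
    similarity-weighted average of the stored example rewards."""
    def reward(x):
        weights = [math.exp(-sharpness * math.dist(x, ex)) for ex, _ in examples]
        return sum(w * r for w, (_, r) in zip(weights, examples)) / sum(weights)
    return reward


def fit_mixed(examples, rule_weight=0.5):
    """Mixed strategy (assumed form): a fixed blend of the two mappings."""
    rule, sim = fit_rule(examples), fit_similarity(examples)
    return lambda x: rule_weight * rule(x) + (1 - rule_weight) * sim(x)


rule_reward = fit_rule(EXAMPLES)
similarity_reward = fit_similarity(EXAMPLES)
mixed_reward = fit_mixed(EXAMPLES)
```

Each function returns a mapping from outcome features to a scalar reward. The rule discards every feature but one and stores a single threshold, whereas the similarity model retains all examples and generalizes by distance, so the two approaches differ in what is compressed and in how they respond to novel outcomes.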
References
Furthermore, the process by which people create compressed reward functions from raw example outcomes remains to be elucidated.
— Reward function compression facilitates goal-dependent reinforcement learning
(2509.06810 - Molinaro et al., 8 Sep 2025) in Discussion