Process for creating compressed reward functions from example outcomes

Elucidate the process by which people create compressed reward functions from raw example outcomes. Specify the algorithmic and representational steps that yield simplified mappings from outcome features to scalar rewards, and clarify whether rule-based, similarity-based, or mixed strategies are employed.

Background

The authors introduce the concept of a compressed reward function (an efficient, simplified rule that maps outcome features to rewards) and report behavioral and modeling results consistent with people forming such functions and using them to improve learning efficiency. However, the exact process by which compressed reward functions are constructed from repeated goal-outcome experiences is unspecified.

They suggest that insights from function learning (e.g., rule-based versus similarity-based strategies) might inform this process, but acknowledge that the mechanism remains unresolved and warrants direct investigation.
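To make the contrast between these strategies concrete, the following is a minimal illustrative sketch in Python. It is not the authors' model. It assumes a single numeric outcome feature, a least-squares line as the "rule", a Gaussian-kernel exemplar average as the "similarity" strategy, and a fixed mixing weight. All function names and parameters (fit_rule, fit_exemplar, fit_mixed, width, mix) are hypothetical.

```python
import math

def fit_rule(examples):
    """Rule-based sketch: least-squares line, reward = a * feature + b."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_r = sum(r for _, r in examples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in examples)
    cov = sum((x - mean_x) * (r - mean_r) for x, r in examples)
    a = cov / var_x if var_x > 0 else 0.0  # flat rule if the feature never varies
    b = mean_r - a * mean_x
    return lambda x: a * x + b

def fit_exemplar(examples, width=1.0):
    """Similarity-based sketch: kernel-weighted average of stored rewards."""
    mean_r = sum(r for _, r in examples) / len(examples)

    def predict(x):
        weights = [math.exp(-((x - xi) ** 2) / (2 * width ** 2))
                   for xi, _ in examples]
        total = sum(weights)
        if total == 0.0:  # query too far from every exemplar (underflow)
            return mean_r
        return sum(w * r for w, (_, r) in zip(weights, examples)) / total

    return predict

def fit_mixed(examples, mix=0.5, width=1.0):
    """Mixed sketch: fixed convex combination of the two strategies."""
    rule = fit_rule(examples)
    exemplar = fit_exemplar(examples, width)
    return lambda x: mix * rule(x) + (1 - mix) * exemplar(x)

# Hypothetical (feature, reward) pairs observed under one goal.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
rule = fit_rule(examples)
exemplar = fit_exemplar(examples)
mixed = fit_mixed(examples)
```

Under these assumptions the strategies diverge most clearly on novel outcomes: the rule extrapolates beyond the observed feature range, whereas the exemplar average stays within the range of rewards already experienced. Generalization to unseen outcomes is therefore one way such accounts could, in principle, be told apart.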

References

Furthermore, the process by which people create compressed reward functions from raw example outcomes remains to be elucidated.

— Reward function compression facilitates goal-dependent reinforcement learning (2509.06810 - Molinaro et al., 8 Sep 2025) in Discussion
