Expanding the atomic-primitives search space for DERL

Develop methods to expand the reward-function search space in Differentiable Evolutionary Reinforcement Learning (DERL) by incorporating more granular or semantically rich atomic primitives, potentially extracted automatically from task descriptions, while maintaining the framework’s effectiveness and stability.

Background

DERL parameterizes reward functions as symbolic compositions of atomic primitives and trains a meta-optimizer to evolve these structures using validation performance as feedback. This design enables differentiable meta-optimization over reward structure rather than black-box evolutionary search.
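The idea of a reward function as a differentiable composition of atomic primitives can be sketched as follows. The primitive names, signatures, and the softmax relaxation are illustrative assumptions, not the paper's actual implementation; the point is only that structure weights become continuous parameters a meta-optimizer can adjust.

```python
import numpy as np

# Hypothetical atomic primitives over a transition (s, a, s_next).
# These names and forms are illustrative, not taken from the DERL paper.
PRIMITIVES = [
    lambda s, a, s_next: -np.linalg.norm(s_next),         # distance penalty
    lambda s, a, s_next: -np.sum(np.square(a)),           # action-cost penalty
    lambda s, a, s_next: float(                           # progress bonus
        np.linalg.norm(s_next) < np.linalg.norm(s)),
]

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def composed_reward(theta, s, a, s_next):
    """One differentiable relaxation of 'symbolic composition':
    a softmax mixture over primitive outputs, so gradients w.r.t.
    theta can steer which primitives dominate the reward."""
    weights = softmax(theta)
    values = np.array([p(s, a, s_next) for p in PRIMITIVES])
    return float(weights @ values)

theta = np.zeros(len(PRIMITIVES))  # meta-parameters over reward structure
s = np.array([1.0, 0.0])
a = np.array([0.1, 0.0])
s_next = np.array([0.9, 0.0])
r = composed_reward(theta, s, a, s_next)  # → 0.03 with uniform weights
```

In the actual framework, `theta` would be updated by the meta-optimizer using validation performance of the policy trained under `composed_reward`, rather than by a direct gradient on `r` itself.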

However, the expressivity of discovered reward functions is currently limited by the predefined set of atomic primitives. The authors note that, although their chosen primitives work well in the evaluated domains, the meta-optimizer cannot invent new functional capabilities beyond this grammar. They explicitly flag expanding this primitive set—possibly via automatic extraction from task descriptions—as an open challenge.
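Since the paper leaves automatic primitive extraction as future work, the following is only a speculative sketch of the simplest possible approach: a keyword-to-primitive lookup over a natural-language task description. The keyword table and primitive names are invented for illustration; a realistic system would presumably use a learned language model rather than string matching.

```python
import re

# Purely illustrative keyword -> candidate-primitive table (not from the paper).
KEYWORD_TO_PRIMITIVE = {
    "reach": "distance_to_goal",
    "balance": "upright_angle",
    "energy": "action_cost",
    "fast": "velocity_bonus",
}

def extract_candidate_primitives(description: str) -> list[str]:
    """Scan a task description for known keywords and return the names of
    candidate primitives to add to the reward-function search grammar."""
    tokens = re.findall(r"[a-z]+", description.lower())
    return sorted({KEYWORD_TO_PRIMITIVE[t] for t in tokens if t in KEYWORD_TO_PRIMITIVE})

candidates = extract_candidate_primitives(
    "Reach the target fast while minimizing energy use")
# → ['action_cost', 'distance_to_goal', 'velocity_bonus']
```

Even this toy version surfaces the open questions the authors flag: how to ground extracted primitives in executable functions, and how to keep the enlarged search space stable for the meta-optimizer.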

References

Expanding the search space to include more granular or semantically rich primitives—potentially extracted automatically from task descriptions—remains an open challenge.

Differentiable Evolutionary Reinforcement Learning (arXiv:2512.13399, Cheng et al., 15 Dec 2025), "Limitations and Future Work" section.