Fully general relative continuity of eUDRL asymptotic accumulation points at deterministic kernels

Establish a fully general theory of relative continuity for the sets of accumulation points of policies generated by episodic Upside-Down Reinforcement Learning (eUDRL) at deterministic transition kernels, beyond the special cases treated in the paper.

Background

In its asymptotic analysis, the paper proves relative continuity of the sets of accumulation points of eUDRL policies in two special cases, and it fully treats the regularized recursion. The fully general case at deterministic kernels, however, remains unresolved.

Solving this problem would complete the asymptotic stability and continuity theory for eUDRL by removing current restrictions on the environment or uniqueness assumptions.

References

Although we believe that the outlined conditions encompass a wide range of practical scenarios, a fully general discussion of the relative continuity of accumulation point sets for eUDRL-generated policies at deterministic kernels remains an open problem.

— On the Convergence and Stability of Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning, and Online Decision Transformers (2502.05672 - Štrupl et al., 8 Feb 2025) in Conclusion
