Convergence theory for PG-DPO and costate estimation in the Two-Stage variant
Investigate the theoretical convergence properties of the Pontryagin-Guided Direct Policy Optimization (PG-DPO) algorithm for continuous-time multi-asset portfolio optimization, and in particular characterize the convergence of the costate estimates that the Two-Stage PG-DPO (2-PG-DPO) variant obtains via backpropagation through time.
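To make the object of study concrete, the following is a minimal sketch of a pathwise costate estimate obtained by a hand-written reverse (backpropagation-through-time) sweep. It is not the paper's implementation. It assumes a single risky asset, a fixed constant risky fraction `pi` in place of a trained policy network, CRRA terminal utility, no consumption, and an Euler discretization. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_path(x0, pi, shocks, mu=0.08, r=0.02, sigma=0.2, dt=0.01):
    """Euler wealth path under a constant risky fraction pi (illustrative assumption).

    Returns the wealth sequence and the per-step growth factors dX_{k+1}/dX_k.
    """
    xs, growth = [x0], []
    for z in shocks:
        g = 1.0 + (r + pi * (mu - r)) * dt + pi * sigma * math.sqrt(dt) * z
        growth.append(g)
        xs.append(xs[-1] * g)
    return xs, growth


def crra_utility(x, gamma=2.0):
    """CRRA terminal utility U(x) = x^(1-gamma) / (1-gamma), for gamma != 1."""
    return x ** (1.0 - gamma) / (1.0 - gamma)


def pathwise_costates(xs, growth, gamma=2.0):
    """Reverse sweep: lambda_N = U'(X_N), then lambda_k = lambda_{k+1} * dX_{k+1}/dX_k."""
    lam = [0.0] * len(xs)
    lam[-1] = xs[-1] ** (-gamma)  # U'(X_N) for CRRA utility
    for k in range(len(growth) - 1, -1, -1):
        lam[k] = lam[k + 1] * growth[k]
    return lam


def estimate_initial_costate(x0, pi, n_paths=2000, n_steps=100, seed=0):
    """Monte Carlo average of the pathwise costate at time 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        shocks = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
        xs, growth = simulate_path(x0, pi, shocks)
        total += pathwise_costates(xs, growth)[0]
    return total / n_paths


if __name__ == "__main__":
    print(estimate_initial_costate(x0=1.0, pi=0.5))
```

In this toy setting the estimate depends on the number of simulated paths, the time step, and the policy held fixed during the sweep. The open question concerns how the corresponding estimates behave in the full multi-asset algorithm.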
References
Furthermore, investigating the theoretical convergence properties of the PG-DPO algorithm, particularly the costate estimation process within the 2-PG-DPO variant, remains an important open question.
Source: "Breaking the Dimensional Barrier: A Pontryagin-Guided Direct Policy Optimization for Continuous-Time Multi-Asset Portfolio" (arXiv:2504.11116, Huh et al., 15 Apr 2025), Section 5 (Conclusion)