Comparative advantages of actor-only versus critic-only RL paradigms in finance
Determine the comparative advantages of actor-only reinforcement learning methods (such as policy gradient) versus critic-only methods (such as Q-learning) for financial portfolio optimization by evaluating both under consistent benchmarks and experimental setups, so that their performance characteristics can be compared rigorously.
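As an illustration of what a consistent experimental setup could look like, the sketch below runs every candidate policy over the same seeded return series and scores them with the same metric. It is a minimal sketch under stated assumptions, not a benchmark from the cited paper: the synthetic Gaussian returns, the names `make_returns`, `evaluate`, and `compare`, and the annualized Sharpe ratio as the metric are all hypothetical choices, and the two placeholder policies are simple rules standing in for trained actor-only and critic-only agents.

```python
import numpy as np


def make_returns(seed, n_steps=250, n_assets=3):
    """Synthetic daily asset returns (assumption: i.i.d. Gaussian, for illustration only)."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0005, 0.01, size=(n_steps, n_assets))


def evaluate(policy, returns):
    """Run a policy over a return series and report summary statistics.

    `policy` is any callable mapping the history of past returns
    (shape [t, n_assets]) to a vector of portfolio weights.
    """
    n_assets = returns.shape[1]
    realized = []
    for t in range(1, len(returns)):
        weights = np.clip(policy(returns[:t]), 0.0, None)  # long-only
        total = weights.sum()
        # Fall back to equal weights if the policy allocates nothing.
        weights = weights / total if total > 0 else np.full(n_assets, 1.0 / n_assets)
        realized.append(float(weights @ returns[t]))
    realized = np.array(realized)
    vol = realized.std(ddof=1)
    return {
        "mean": realized.mean(),
        "vol": vol,
        "sharpe": realized.mean() / vol * np.sqrt(252),  # annualized, 252 trading days
    }


def compare(policies, seeds):
    """Evaluate every policy on identical data for each seed."""
    results = {name: [] for name in policies}
    for seed in seeds:
        returns = make_returns(seed)  # same series for all policies
        for name, policy in policies.items():
            results[name].append(evaluate(policy, returns)["sharpe"])
    return results


# Placeholder policies (NOT reinforcement learning agents): in a real study
# these would be a trained policy-gradient agent and a trained Q-learning agent.
def equal_weight(history):
    return np.ones(history.shape[1])


def momentum(history):
    return history[-20:].mean(axis=0)


results = compare({"actor_only_stub": equal_weight, "critic_only_stub": momentum},
                  seeds=[0, 1, 2])
```

Because each seed produces one return series that every policy sees, differences in the reported Sharpe ratios come from the policies rather than from differing data or metrics, which is the kind of control the comparison calls for.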
References
Despite extensive study, the comparative advantages of these paradigms remain unclear due to inconsistent benchmarks and experimental setups in the literature.
— Deep Reinforcement Learning for Optimal Asset Allocation Using DDPG with TiDE
(2508.20103 - Liu et al., 12 Aug 2025) in Section 2: State of the art