Identify and evaluate alternative priority schemes for SCLD’s replay buffer
Investigate alternative prioritization schemes for the prioritized replay buffer used to train Sequential Controlled Langevin Diffusions (SCLD), including prioritization by importance weights computed from Radon–Nikodym derivatives, and determine how each scheme affects training stability, sample efficiency, and sampling performance.
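As an illustration of what prioritization by importance weight could look like, the sketch below draws buffer indices with probability proportional to a tempered importance weight. It is a minimal example under stated assumptions, not the SCLD implementation: the function name `sample_by_log_weight`, the `temperature` parameter, and the assumption that each buffer entry stores a scalar log importance weight (a log Radon–Nikodym derivative) are all hypothetical.

```python
import numpy as np


def sample_by_log_weight(log_weights, batch_size, temperature=1.0, rng=None):
    """Draw buffer indices with probability proportional to exp(log_w / temperature).

    Hypothetical helper: assumes each buffer entry carries one scalar log
    importance weight. A larger temperature flattens the priorities toward
    uniform sampling.
    """
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(log_weights, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max so the exponentials cannot overflow
    probs = np.exp(scaled)
    probs /= probs.sum()  # normalize to a probability vector
    idx = rng.choice(len(probs), size=batch_size, replace=True, p=probs)
    return idx, probs
```

Comparing schemes would then amount to swapping the priority passed in (for example, a per-trajectory loss versus a log importance weight) while holding the rest of the training loop fixed.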
References
We note that there are many alternative possibilities for choosing the buffer priority (including by importance weight), which we leave to future exploration.
— Sequential Controlled Langevin Diffusions
(2412.07081 - Chen et al., 10 Dec 2024) in Appendix — Algorithmic details and pseudocode: Replay buffers