Comparative performance of GFlowNets vs state-of-the-art RL post-training methods
Determine how GFlowNet-based approaches to post-training diffusion text-to-image models perform relative to the state-of-the-art methods Flow-GRPO and DanceGRPO, using comparable evaluation protocols and metrics for reward-guided alignment and quality.
References
However, performance vs. mainstream SOTA (Flow-GRPO, DanceGRPO) is unknown.
— Finite Difference Flow Optimization for RL Post-Training of Text-to-Image Models
(2603.12893 - McAllister et al., 13 Mar 2026) in Section 2, Previous Work