More effective training strategies for masking-type corruptions in Ambient Diffusion Omni

Determine whether more effective training strategies exist for corruption types such as masking when training diffusion models with the Ambient Diffusion Omni (Ambient-o) framework, which currently performs well primarily under high-frequency corruptions.

Background

Ambient Diffusion Omni (Ambient-o) trains diffusion models from mixed-quality data by leveraging two principles: at high noise levels, additive Gaussian noise contracts distributional differences, enabling learning from low-quality data; at low noise levels, denoising can rely on local crops, allowing the use of out-of-distribution images whose patch statistics match the target distribution.
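The two principles above can be expressed as a noise-level-dependent data-selection rule. Below is a minimal sketch of that idea, not the authors' implementation: the threshold `sigma_high` and the way the data pools are represented are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_training_batch(sigma, high_quality, low_quality, ood_patches,
                          sigma_high=1.0):
    """Choose training data based on the diffusion noise level sigma.

    - At high noise (sigma >= sigma_high), additive Gaussian noise
      contracts distributional differences, so low-quality images can
      be mixed in with the high-quality data.
    - At low noise, denoising relies mostly on local context, so
      out-of-distribution crops whose patch statistics match the
      target distribution can be used instead.
    The threshold sigma_high and the fixed-size array pools are
    simplifying assumptions for illustration.
    """
    if sigma >= sigma_high:
        clean = np.concatenate([high_quality, low_quality])
    else:
        clean = np.concatenate([high_quality, ood_patches])
    noisy = clean + sigma * rng.standard_normal(clean.shape)
    return clean, noisy
```

A real training loop would sample `sigma` per step and feed `(noisy, clean)` pairs to the denoiser; the point of the sketch is only that the admissible data pool widens or changes with the noise level.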

The authors note that this approach is particularly effective for high-frequency degradations (e.g., blur), but less so for corruptions that alter low-frequency content or structure. Among such challenging cases, masking, where content is missing or occluded, is especially difficult: added Gaussian noise does not easily neutralize the discrepancy, and local patch statistics may not align with the target distribution. This motivates the open question of whether more effective training strategies can be devised for these corruption types within the Ambient-o framework.

References

Algorithmically, while our method performs well under high-frequency corruptions, it remains an open question whether more effective training strategies could be used for different types of corruptions (e.g., masking).

Ambient Diffusion Omni: Training Good Models with Bad Data (2506.10038 - Daras et al., 10 Jun 2025) in Section 6, Limitations and Future Work