Effect of test-time averaging on poor-quality image reconstructions

Determine whether aggregating decoder predictions improves low-quality reconstructions from EEG, MEG, 3T fMRI, and 7T fMRI recordings, where aggregation means subject-level and instance-level averaging of the predicted CLIP-Image, CLIP-Text, and AutoKL embeddings before diffusion-based image generation. Characterize the conditions under which such averaging helps.

Background

The authors reconstruct images by conditioning a diffusion model on predicted embeddings (CLIP-Image, CLIP-Text, AutoKL). They show that averaging predictions across repetitions or subjects generally improves both reconstruction metrics and qualitative results. However, the size of the benefit varies across reconstruction quality tiers.
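
As a minimal sketch of test-time averaging (not the authors' implementation), the snippet below groups predicted embedding vectors by a key and averages them element-wise before they would be passed to the diffusion model. The record layout, the function name `average_predictions`, and the toy 2-D vectors are hypothetical; real CLIP-Image, CLIP-Text, and AutoKL predictions are much higher-dimensional, and the mapping of the two grouping keys onto the paper's averaging strategies is an assumption.

```python
from collections import defaultdict
from statistics import fmean

def average_predictions(records, key):
    """Average predicted embedding vectors that share the same key.

    records: iterable of dicts with hypothetical fields
             'subject', 'image', 'embedding' (list of floats).
    key:     function mapping a record to its grouping key.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec["embedding"])
    # Element-wise mean over all vectors in each group.
    return {k: [fmean(dim) for dim in zip(*vecs)] for k, vecs in groups.items()}

# Toy predictions (illustrative only): two subjects, two repetitions each.
records = [
    {"subject": "s1", "image": "img0", "embedding": [1.0, 0.0]},
    {"subject": "s1", "image": "img0", "embedding": [3.0, 2.0]},
    {"subject": "s2", "image": "img0", "embedding": [5.0, 4.0]},
    {"subject": "s2", "image": "img0", "embedding": [7.0, 6.0]},
]

# Average across repetitions, separately for each subject.
per_subject = average_predictions(records, key=lambda r: (r["subject"], r["image"]))
# Average across all subjects and repetitions of each image.
per_image = average_predictions(records, key=lambda r: r["image"])
```

Passing the grouping rule as a function keeps the averaging step identical for both strategies; only the choice of which predictions are pooled changes.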

Specifically, averaging consistently helps high- and medium-quality reconstructions, but its effect on low-quality (failure) cases remains uncertain. Determining whether and when aggregation materially improves poor reconstructions would guide practical deployment and post-processing in real-time or low-SNR regimes.
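
One way to probe this question is to compare a reconstruction metric with and without averaging, separately for each quality tier. The sketch below assumes a hypothetical per-image score where higher is better and a tier label assigned to each image in advance; the function name `mean_gain_by_tier` and all numbers are placeholders for illustration, not results from the paper.

```python
from collections import defaultdict
from statistics import fmean

def mean_gain_by_tier(rows):
    """Mean (averaged - single) score difference for each quality tier."""
    gains = defaultdict(list)
    for tier, single, averaged in rows:
        gains[tier].append(averaged - single)
    return {tier: fmean(vals) for tier, vals in gains.items()}

# (tier, score without averaging, score with averaging); placeholder values.
rows = [
    ("high", 0.75, 1.0),
    ("high", 0.5, 0.75),
    ("low", 0.25, 0.25),
    ("low", 0.125, 0.25),
]

gain = mean_gain_by_tier(rows)
for tier, value in gain.items():
    print(f"{tier}: mean gain {value:+.4f}")
```

A per-tier gain near zero for the low tier would suggest that averaging does not rescue failure cases, whereas a clearly positive gain would suggest it does; a real analysis would also need confidence intervals over images and subjects.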

References

These results confirm that aggregating predictions benefits high- and medium-quality reconstructions, though it is unclear whether it actually benefits bad reconstructions.

— Scaling laws for decoding images from brain activity (2501.15322 - Banville et al., 25 Jan 2025) in Appendix, Section “Quality of image reconstructions across devices and test-time averaging strategies”
