Convert VLM 2AFC judgments into an optimizable perceptual metric for GAN-based compression
Determine a principled procedure for converting the binary two-alternative forced-choice (2AFC) judgments of visual similarity produced by vision-language models (VLMs) into an optimizable, differentiable perceptual metric that can be used to train generative adversarial network (GAN)-based, perceptually oriented image compression models.
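To make the difficulty concrete, the sketch below shows one generic way a binary 2AFC label can be turned into a differentiable training signal: a Bradley-Terry style logistic loss on the difference of two predicted distances. This is a standard preference-learning formulation, not a procedure taken from the cited paper. The function name `two_afc_loss`, the `temperature` parameter, and the assumption that each candidate already has a scalar distance to the reference are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def two_afc_loss(d_a: float, d_b: float, prefers_a: bool, temperature: float = 1.0):
    """Cross-entropy between a binary 2AFC judgment and a Bradley-Terry probability.

    d_a, d_b  : predicted perceptual distances of candidates A and B to the
                reference (hypothetical outputs of some learnable metric).
    prefers_a : True if the judge picked A as closer to the reference.
    Returns (loss, dloss/dd_a, dloss/dd_b).
    """
    # A smaller distance should mean a higher chance of being preferred.
    z = (d_b - d_a) / temperature
    p_a = sigmoid(z)
    y = 1.0 if prefers_a else 0.0

    eps = 1e-12  # guards against log(0)
    loss = -(y * math.log(p_a + eps) + (1.0 - y) * math.log(1.0 - p_a + eps))

    # For logistic cross-entropy, dloss/dz = p_a - y; the chain rule gives the rest.
    grad_z = p_a - y
    grad_d_a = -grad_z / temperature
    grad_d_b = grad_z / temperature
    return loss, grad_d_a, grad_d_b


# Example: the judge prefers A, and the metric already ranks A as closer.
loss, g_a, g_b = two_afc_loss(d_a=0.2, d_b=0.8, prefers_a=True)
print(loss, g_a, g_b)
```

In practice the two distances would come from a learnable network trained under automatic differentiation rather than from hand-written gradients. Whether a metric fitted this way is a good enough objective for a GAN-based codec is exactly what the problem statement leaves open.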
References
However, it is not clear how to convert the binary 2AFC judgments produced by VLMs into an optimizable perceptual metric which can be exploited by existing GAN-based perceptually oriented compression systems.
— VLIC: Vision-Language Models As Perceptual Judges for Human-Aligned Image Compression
(2512.15701 - Sargent et al., 17 Dec 2025) in Section 1 (Introduction)