Mask CycleGAN: Unpaired Multi-modal Domain Translation with Interpretable Latent Variable

Published 14 May 2022 in cs.LG | (2205.06969v1)

Abstract: We propose Mask CycleGAN, a novel architecture for unpaired image domain translation. Built on CycleGAN, it aims to address two issues: 1) unimodality in image translation and 2) the lack of interpretability of latent variables. Our technical innovation comprises three key components: the masking scheme, the generator, and the objective. Experimental results demonstrate that this architecture can introduce variations into generated images in a controllable manner and is reasonably robust to different masks.