Unsupervised Multiple Domain Translation through Controlled Disentanglement in Variational Autoencoder (2401.09180v2)

Published 17 Jan 2024 in cs.LG and cs.CV

Abstract: Unsupervised Multiple Domain Translation is the task of transforming data from one domain to other domains without paired data to train the systems. Typically, methods based on Generative Adversarial Networks (GANs) are used to address this task. Our proposal, however, relies exclusively on a modified Variational Autoencoder. The modification consists of two latent variables disentangled in a controlled way by design: one is constrained to depend exclusively on the domain, while the other must depend on the remaining factors of variation in the data. Additionally, the conditions imposed on the domain latent variable allow for better control and understanding of the latent space. We empirically demonstrate that our approach works on several vision datasets, outperforming other well-known methods. Finally, we verify that one of the latent variables indeed stores all the domain-related information, while the other contains hardly any domain information.
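The core mechanism described in the abstract, splitting the latent space into a domain code and a content code, then translating by swapping the domain code, can be illustrated with a minimal sketch. This is a hypothetical toy with random linear maps standing in for the trained encoder/decoder networks (the names `encode`, `decode`, `translate`, and the per-domain anchor codes are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

D_X, D_C, D_D, N_DOM = 8, 4, 2, 3  # data dim, content dim, domain dim, #domains

# Toy linear weights standing in for trained encoder/decoder networks.
W_enc_c = rng.normal(size=(D_X, D_C)) * 0.1
W_enc_d = rng.normal(size=(D_X, D_D)) * 0.1
W_dec = rng.normal(size=(D_C + D_D, D_X)) * 0.1
# The paper constrains the domain latent by design; here we simply fix
# one anchor vector per domain as a stand-in for that constraint.
domain_codes = rng.normal(size=(N_DOM, D_D))

def encode(x):
    """Split x into a content code and a domain code (deterministic means;
    a full VAE would also predict variances and sample via reparameterization)."""
    return x @ W_enc_c, x @ W_enc_d

def decode(z_content, z_domain):
    """Reconstruct from the concatenated latent codes."""
    return np.concatenate([z_content, z_domain]) @ W_dec

def translate(x, target_domain):
    """Domain translation: keep the content code, swap in the target
    domain's anchor code, and decode."""
    z_content, _ = encode(x)
    return decode(z_content, domain_codes[target_domain])

x = rng.normal(size=D_X)
y = translate(x, target_domain=1)  # x rendered in domain 1
```

Translating the same input to two different domains reuses the identical content code and differs only through the swapped domain code, which is exactly the controlled disentanglement the abstract claims.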
