
Patch-wise Graph Contrastive Learning for Image Translation (2312.08223v2)

Published 13 Dec 2023 in cs.CV

Abstract: Patch-wise contrastive learning has recently drawn attention in image translation because it exploits the semantic correspondence between the input and output images. To further exploit the patch-wise topology for high-level semantic understanding, we use a graph neural network to capture topology-aware features. Specifically, we construct a graph from the patch-wise similarity computed by a pretrained encoder, and share its adjacency matrix between the input and the output to enhance the consistency of their patch-wise relations. We then obtain node features from the graph neural network and strengthen the correspondence between nodes by increasing their mutual information with a contrastive loss. To capture the hierarchical semantic structure, we further propose graph pooling. Experimental results demonstrate state-of-the-art image translation performance, which we attribute to the semantic encoding provided by the constructed graphs.
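
The pipeline described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation. The thresholded cosine-similarity graph, the weight-free mean aggregation standing in for a graph neural network layer, the temperature value, and all function names are assumptions made for illustration, and graph pooling is omitted. Random vectors stand in for the patch features of a pretrained encoder.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]


def build_adjacency(feats, threshold=0.5):
    """Connect patches whose cosine similarity exceeds `threshold`.

    Self-loops are always included. The threshold rule is an assumption.
    """
    unit = [normalize(f) for f in feats]
    n = len(unit)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sim = sum(a * b for a, b in zip(unit[i], unit[j]))
            adj[i][j] = 1.0 if (i == j or sim > threshold) else 0.0
    return adj


def graph_conv(adj, feats):
    """One weight-free propagation step: average the features of each node's neighbours."""
    dim = len(feats[0])
    out = []
    for row in adj:
        deg = sum(row)  # at least 1 because of the self-loop
        out.append([
            sum(row[j] * feats[j][d] for j in range(len(feats))) / deg
            for d in range(dim)
        ])
    return out


def info_nce(nodes_in, nodes_out, temperature=0.07):
    """InfoNCE loss: output node i should match input node i among all input nodes."""
    keys = [normalize(v) for v in nodes_in]
    queries = [normalize(v) for v in nodes_out]
    loss = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        m = max(logits)  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(queries)


# Hypothetical patch features: 6 patches, 8 dimensions each.
rng = random.Random(0)
input_feats = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(6)]
# A "translated" image whose patches stay close to the corresponding input patches.
output_feats = [[x + 0.05 * rng.gauss(0, 1) for x in f] for f in input_feats]

# The adjacency matrix is built once from the input and shared with the output.
adj = build_adjacency(input_feats)
loss_matched = info_nce(graph_conv(adj, input_feats), graph_conv(adj, output_feats))

# Breaking the patch correspondence should raise the loss.
shuffled = output_feats[1:] + output_feats[:1]
loss_shuffled = info_nce(graph_conv(adj, input_feats), graph_conv(adj, shuffled))

print(f"matched: {loss_matched:.4f}  shuffled: {loss_shuffled:.4f}")
```

Sharing `adj` between the two calls to `graph_conv` mirrors the shared adjacency matrix in the abstract: both images are aggregated over the same patch neighbourhoods, so the contrastive loss compares nodes that summarize the same regions.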

