Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport (2402.18411v4)
Abstract: Unsupervised cross-domain image retrieval (UCIR) aims to retrieve images sharing the same category across diverse domains without relying on labeled data. Prior approaches have typically decomposed the UCIR problem into two distinct tasks: intra-domain representation learning and cross-domain feature alignment. However, these segregated strategies overlook the potential synergies between these tasks. This paper introduces ProtoOT, a novel Optimal Transport formulation explicitly tailored for UCIR, which integrates intra-domain feature representation learning and cross-domain alignment into a unified framework. ProtoOT leverages the strengths of the K-means clustering method to effectively manage distribution imbalances inherent in UCIR. By utilizing K-means for generating initial prototypes and approximating class marginal distributions, we modify the constraints in Optimal Transport accordingly, significantly enhancing its performance in UCIR scenarios. Furthermore, we incorporate contrastive learning into the ProtoOT framework to further improve representation learning. This encourages local semantic consistency among features with similar semantics, while also explicitly enforcing separation between features and unmatched prototypes, thereby enhancing global discriminativeness. ProtoOT surpasses existing state-of-the-art methods by a notable margin across benchmark datasets. Notably, on DomainNet, ProtoOT achieves an average P@200 enhancement of 18.17%, and on Office-Home, it demonstrates a P@15 improvement of 3.83%.
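The abstract's central idea, using K-means both to initialise prototypes and to estimate the class marginal that constrains Optimal Transport, can be illustrated with a small sketch. This is not the authors' ProtoOT formulation, which the abstract does not spell out. It is a generic entropic OT solve with Sinkhorn iterations, assuming cosine cost, a uniform marginal over samples, and a prototype marginal taken from smoothed K-means cluster sizes. The function names, the `eps` value, the add-one smoothing, and the random features are all hypothetical choices made for illustration.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, hard labels)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every sample to every centroid.
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return centroids, labels

def sinkhorn(cost, row, col, eps=0.05, iters=200):
    """Entropic OT: alternately rescale the Gibbs kernel to fit both marginals."""
    kernel = np.exp(-cost / eps)
    u = np.ones_like(row)
    for _ in range(iters):
        v = col / (kernel.T @ u)
        u = row / (kernel @ v)
    return u[:, None] * kernel * v[None, :]

# Hypothetical stand-in for encoder features: 60 unit-norm vectors.
rng = np.random.default_rng(0)
feats = rng.normal(size=(60, 8))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)

# K-means supplies initial prototypes and an estimate of the class marginal.
protos, labels = kmeans(feats, k=3)
protos_n = protos / np.linalg.norm(protos, axis=1, keepdims=True)
counts = np.bincount(labels, minlength=3).astype(float)
col = (counts + 1.0) / (counts + 1.0).sum()  # assumed add-one smoothing
row = np.full(len(feats), 1.0 / len(feats))  # uniform mass per sample

# Cosine cost between samples and prototypes, then the transport plan.
cost = 1.0 - feats @ protos_n.T
plan = sinkhorn(cost, row, col)
pseudo_labels = plan.argmax(axis=1)  # prototype receiving the most mass per sample
```

Replacing `col` with a uniform vector recovers the equal-partition constraint common in clustering-based self-labelling. The estimated marginal instead lets prototypes absorb unequal amounts of mass, which is the kind of imbalance handling the abstract attributes to K-means.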