A Causal Inspired Early-Branching Structure for Domain Generalization (2403.08649v1)

Published 13 Mar 2024 in cs.CV

Abstract: Learning domain-invariant semantic representations is crucial for achieving domain generalization (DG), where a model is required to perform well on unseen target domains. One critical challenge is that standard training often results in entangled semantic and domain-specific features. Previous works suggest formulating the problem from a causal perspective and solving the entanglement problem by enforcing marginal independence between the causal (i.e., semantic) and non-causal (i.e., domain-specific) features. Despite its simplicity, this basic marginal-independence idea alone may be insufficient to identify the causal feature. By d-separation, we observe that the causal feature can be further characterized as independent of the domain conditioned on the object, and we propose the following two strategies as complements to the basic framework. First, the observation implies that, for the same object, the causal feature should not be associated with the non-causal feature, revealing that the common practice of obtaining the two features with a shared base feature extractor and two lightweight prediction heads may be inappropriate. To meet this constraint, we propose a simple early-branching structure, in which the branches that extract the causal and non-causal features share the first few blocks and diverge thereafter. Second, the observation implies that the causal feature remains invariant across different domains for the same object. To this end, we suggest that augmentation should be incorporated into the framework to better characterize the causal feature, and we further propose an effective random domain sampling scheme to fulfill this task. Theoretical and experimental results show that both strategies benefit the basic marginal-independence framework. Code is available at https://github.com/liangchen527/CausEB.
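The marginal-independence constraint in the basic framework is commonly enforced with a kernel dependence measure such as the Hilbert-Schmidt Independence Criterion (HSIC). Below is a minimal NumPy sketch of the biased empirical HSIC estimator; the Gaussian-kernel bandwidth is a hypothetical choice, and this is a generic illustration rather than the paper's exact penalty.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix for row-wise samples in X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC estimate between feature sets X and Y.

    Values near zero suggest (marginal) independence; larger values
    indicate dependence between the two feature sets.
    """
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

In a disentanglement setup like the one described above, such a term would be minimized between the causal and non-causal feature batches during training.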
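The early-branching idea itself can be sketched as follows: instead of one shared backbone with two lightweight heads, the causal and non-causal branches share only the first few blocks and then diverge. This is a minimal NumPy illustration with dense layers standing in for convolutional blocks; the class name and all layer sizes are hypothetical, not the paper's ResNet-based implementation.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class EarlyBranchingNet:
    """Toy early-branching network: a shared stem followed by two
    separate branches for the causal (semantic) and non-causal
    (domain-specific) features."""

    def __init__(self, d_in=32, d_hidden=64, d_feat=16, seed=0):
        rng = np.random.default_rng(seed)
        # Shared stem: stands in for the first few shared blocks.
        self.W_stem = 0.1 * rng.standard_normal((d_in, d_hidden))
        # Branches diverge after the stem (hypothetical sizes).
        self.W_causal = 0.1 * rng.standard_normal((d_hidden, d_feat))
        self.W_domain = 0.1 * rng.standard_normal((d_hidden, d_feat))

    def forward(self, x):
        h = relu(x @ self.W_stem)            # shared low-level features
        z_causal = relu(h @ self.W_causal)   # semantic branch
        z_domain = relu(h @ self.W_domain)   # domain-specific branch
        return z_causal, z_domain
```

The design point is that only low-level features are shared, so the two branches are not forced to associate the causal feature with the non-causal one through a fully shared representation.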
