Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal Approach (2312.09758v1)
Abstract: Invariant representation learning (IRL) encourages predicting labels from invariant causal features that are de-confounded from the environments, advancing the technical roadmap of out-of-distribution (OOD) generalization. Despite the attention IRL has received, recent theoretical results have shown that some causal features recovered by IRL methods merely appear domain-invariant in the training environments but fail in unseen domains. This fake invariance severely endangers OOD generalization, since the trustworthy objective cannot be diagnosed and existing causal surgeries fail to rectify it. In this paper, we review an IRL family (InvRat) under the Partially and Fully Informative Invariant Feature Structural Causal Models (PIIF SCM / FIIF SCM), respectively, to certify their weaknesses in representing fake invariant features, and then unify their causal diagrams to propose the ReStructured SCM (RS-SCM). RS-SCM can ideally rebuild the spurious and the fake invariant features simultaneously. Given this, we further develop an approach based on conditional mutual information with respect to RS-SCM that rigorously rectifies the spurious and fake invariant effects. It can be easily implemented by a small feature selection subnet introduced into the IRL family, which is alternately optimized to achieve our goal. Experiments verify the superiority of our approach in combating the fake invariance issue across a variety of OOD generalization benchmarks.
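For readers unfamiliar with the quantity the abstract relies on, the standard definition of conditional mutual information is recalled here. The symbols are assumptions introduced only for illustration and do not come from the abstract: Z stands for a selected feature, Y for the label, and E for the environment index. This is the textbook definition, not the paper's specific RS-SCM objective.

```latex
I(Y; E \mid Z)
  = \mathbb{E}_{p(y, e, z)}\!\left[ \log \frac{p(y, e \mid z)}{p(y \mid z)\, p(e \mid z)} \right] \ge 0,
\qquad
I(Y; E \mid Z) = 0 \iff Y \perp E \mid Z .
```

Under this notation, a feature Z for which the quantity is zero makes the label conditionally independent of the environment, which is the sense of invariance that methods penalizing conditional mutual information typically target.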