Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters (2310.07361v1)
Abstract: Overfitting to the source domain is a common issue in gradient-based training of deep neural networks. To compensate for the over-parameterized models, numerous regularization techniques have been introduced such as those based on dropout. While these methods achieve significant improvements on classical benchmarks such as ImageNet, their performance diminishes with the introduction of domain shift in the test set i.e. when the unseen data comes from a significantly different distribution. In this paper, we move away from the classical approach of Bernoulli sampled dropout mask construction and propose to base the selection on gradient-signal-to-noise ratio (GSNR) of network's parameters. Specifically, at each training step, parameters with high GSNR will be discarded. Furthermore, we alleviate the burden of manually searching for the optimal dropout ratio by leveraging a meta-learning approach. We evaluate our method on standard domain generalization benchmarks and achieve competitive results on classification and face anti-spoofing problems.
- Nas-ood: Neural architecture search for out-of-distribution generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8320–8329, 2021.
- Oulu-npu: A mobile face presentation attack database with real-world variations. In 2017 12th IEEE international conference on automatic face & gesture recognition (FG 2017), pages 612–618. IEEE, 2017.
- Domain generalization by solving jigsaw puzzles. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2229–2238, 2019.
- Learning to balance specificity and invariance for in and out of domain generalization. In European Conference on Computer Vision, pages 301–318. Springer, 2020.
- Generalizable representation learning for mixture domain face anti-spoofing. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35 of 2, pages 1132–1139, 2021.
- On the effectiveness of local binary patterns in face anti-spoofing. In 2012 BIOSIG-proceedings of the international conference of biometrics special interest group (BIOSIG), pages 1–7. IEEE, 2012.
- Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009.
- Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017.
- Cross-domain similarity learning for face recognition in unseen domains. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15292–15301, 2021.
- Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning, pages 1126–1135. PMLR, 2017.
- Stiffness: A new perspective on generalization in neural networks. arXiv preprint arXiv:1901.09491, 2019.
- Domain-adversarial training of neural networks. The journal of machine learning research, 17(1):2096–2030, 2016.
- Loss function learning for domain generalization by implicit gradient. In International Conference on Machine Learning, pages 7002–7016. PMLR, 2022.
- Dropblock: A regularization method for convolutional networks. Advances in neural information processing systems, 31, 2018.
- Domain generalization via progressive layer-wise and channel-wise dropout. arXiv preprint arXiv:2112.03676, 2021.
- Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
- Batchformer: Learning to explore sample relationships for robust representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7256–7266, 2022.
- The two dimensions of worst-case training and their integrated effect for out-of-domain generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9631–9641, 2022.
- Self-challenging improves cross-domain generalization. In European Conference on Computer Vision, pages 124–140. Springer, 2020.
- Single-side domain generalization for face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8484–8493, 2020.
- Style neophile: Constantly seeking novel styles for domain generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7130–7140, 2022.
- Self-normalizing neural networks. Advances in neural information processing systems, 30, 2017.
- Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25, 2012.
- Fractalnet: Ultra-deep neural networks without residuals. arXiv preprint arXiv:1605.07648, 2016.
- Feature alignment by uncertainty and self-training for source-free unsupervised domain adaptation. Neural Networks, 161:682–692, 2023.
- Deeper, broader and artier domain generalization. In Proceedings of the IEEE international conference on computer vision, pages 5542–5550, 2017.
- Domain generalization with adversarial feature learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5400–5409, 2018.
- Understanding generalization in deep learning via tensor methods. In International Conference on Artificial Intelligence and Statistics, pages 504–515. PMLR, 2020.
- A simple feature augmentation for domain generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8886–8895, 2021.
- Meta-sgd: Learning to learn quickly for few-shot learning. arXiv preprint arXiv:1707.09835, 2017.
- Learning to learn across diverse data biases in deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4072–4082, 2022.
- Understanding why neural networks generalize well through gsnr of parameters. arXiv preprint arXiv:2001.07384, 2020.
- Adaptive normalized representation learning for generalizable face anti-spoofing. In Proceedings of the 29th ACM International Conference on Multimedia, pages 1469–1477, 2021.
- Dual reweighting domain generalization for face presentation attack detection. In IJCAI, 2021.
- Domain generalization using a mixture of multiple latent domains. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34 of 07, pages 11749–11756, 2020.
- Unified deep supervised domain adaptation and generalization. In Proceedings of the IEEE international conference on computer vision, pages 5715–5725, 2017.
- Domain generalization via invariant feature representation. In International Conference on Machine Learning, pages 10–18. PMLR, 2013.
- Reducing domain gap by reducing style bias. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8690–8699, 2021.
- Unsupervised learning of visual representations by solving jigsaw puzzles. In European conference on computer vision, pages 69–84. Springer, 2016.
- Moment matching for multi-source domain adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1406–1415, 2019.
- On the spectral bias of neural networks. In International Conference on Machine Learning, pages 5301–5310. PMLR, 2019.
- Semi-supervised domain adaptation via minimax entropy. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8050–8058, 2019.
- Maximum classifier discrepancy for unsupervised domain adaptation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3723–3732, 2018.
- Multi-source domain adaptation via supervised contrastive learning and confident consistency regularization. arXiv preprint arXiv:2106.16093, 2021.
- Learning to optimize domain specific normalization for domain generalization. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXII 16, pages 68–83. Springer, 2020.
- Generalizing across domains via cross-gradient training. In International Conference on Learning Representations, 2018.
- Multi-adversarial discriminative deep domain generalization for face presentation attack detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10023–10031, 2019.
- Regularized fine-grained meta face anti-spoofing. In AAAI, volume 34 of 07, pages 11974–11981, 2020.
- Open domain generalization with domain-augmented meta-learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9624–9633, 2021.
- On generalizing beyond domains in cross-domain continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9265–9274, 2022.
- Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research, 15(1):1929–1958, 2014.
- Efficient object localization using convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 648–656, 2015.
- Vladimir Vapnik. The nature of statistical learning theory. Springer science & business media, 1999.
- Deep hashing network for unsupervised domain adaptation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 5018–5027, 2017.
- Manifold mixup: Better representations by interpolating hidden states. In International conference on machine learning, pages 6438–6447. PMLR, 2019.
- Cross-domain face presentation attack detection via multi-domain disentangled representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6678–6687, 2020.
- Self-domain adaptation for face anti-spoofing. In Proceedings of the AAAI Conference on Artificial Intelligence, number 4 in 35, pages 2746–2754, 2021.
- Learning to diversify for single domain generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 834–843, 2021.
- Domain generalization via shuffled style assembly for face anti-spoofing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4123–4133, 2022.
- Consistency regularization for deep face anti-spoofing. IEEE Transactions on Information Forensics and Security, 18:1127–1140, 2023.
- Face spoof detection with image distortion analysis. IEEE Transactions on Information Forensics and Security, 10(4):746–761, 2015.
- Deep cocktail network: Multi-source unsupervised domain adaptation with category shift. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3964–3973, 2018.
- Exploiting low-rank structure from latent domains for domain generalization. In European Conference on Computer Vision, pages 628–643. Springer, 2014.
- Robust and generalizable visual representation learning via random convolutions. arXiv preprint arXiv:2007.13003, 2020.
- Cutmix: Regularization strategy to train strong classifiers with localizable features. In Proceedings of the IEEE/CVF international conference on computer vision, pages 6023–6032, 2019.
- Understanding deep learning (still) requires rethinking generalization. Communications of the ACM, 64(3):107–115, 2021.
- mixup: Beyond empirical risk minimization. In International Conference on Learning Representations, 2018.
- Adaptive risk minimization: A meta-learning approach for tackling group shift. CoRR, abs/2007.02931, 2020.
- Deep stable learning for out-of-distribution generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5372–5382, 2021.
- A face antispoofing database with diverse attacks. In 2012 5th IAPR international conference on Biometrics (ICB), pages 26–31. IEEE, 2012.
- Learning to generate novel domains for domain generalization. In European Conference on Computer Vision, pages 561–578. Springer, 2020.
- Domain generalization with mixstyle. In International Conference on Learning Representations, 2020.
- Domain adaptive ensemble learning. IEEE Transactions on Image Processing, 30:8008–8018, 2021.
- Domain generalization with mixstyle. In International Conference on Learning Representations, 2021.
- Domain generalization with mixstyle. arXiv preprint arXiv:2104.02008, 2021.