
Mitigating the Bias in the Model for Continual Test-Time Adaptation

Published 2 Mar 2024 in cs.LG and cs.CV (arXiv:2403.01344v1)

Abstract: Continual Test-Time Adaptation (CTA) is a challenging task that aims to adapt a source pre-trained model to continually changing target domains. In the CTA setting, a model does not know when the target domain changes, thus facing a drastic change in the distribution of streaming inputs during test time. The key challenge is to keep adapting the model to the continually changing target domains in an online manner. We find that a model shows highly biased predictions as it constantly adapts to the changing distribution of the target data. It predicts certain classes more often than other classes, making inaccurate over-confident predictions. This paper mitigates this issue to improve performance in the CTA scenario. To alleviate the bias issue, we make class-wise exponential moving average target prototypes with reliable target samples and exploit them to cluster the target features class-wisely. Moreover, we aim to align the target distributions to the source distribution by anchoring the target feature to its corresponding source prototype. With extensive experiments, our proposed method achieves noteworthy performance gain when applied on top of existing CTA methods without substantial adaptation time overhead.


Summary

  • The paper introduces a novel bias mitigation method using EMA prototypical loss and prototype matching for improved continual adaptation.
  • It aligns target and source distributions to enhance class-wise feature clustering and boosts accuracy on benchmarks like ImageNet-C and CIFAR100-C.
  • The approach integrates seamlessly with existing CTA methods, achieving robust performance improvements with minimal computational overhead.

Mitigating Bias in Continual Test-Time Adaptation for Improved Model Performance

Introduction to the Paper's Context and Contribution

In the ever-evolving domain of deep learning deployment, models often encounter streaming data whose distribution differs significantly from that of the training set. This disparity necessitates mechanisms for continual test-time adaptation (CTA) to maintain model accuracy over time. A particular challenge in CTA is the model's tendency to exhibit bias towards specific classes, leading to overconfident and inaccurate predictions. Addressing this, the research introduces a novel method designed to mitigate such biases, employing class-wise target prototypes and aligning target distributions to source distributions through prototype matching. The approach is seamlessly integrable with existing CTA methods, enhancing performance without notable adaptation time overhead.

Deep Dive into the Methodology

The proposed method comprises two primary components: an exponential moving average (EMA) target-domain prototypical loss and source-distribution alignment via prototype matching. The EMA prototypical loss uses reliable (high-confidence) target samples to continuously update each class prototype, which supports class-wise clustering of the target features. This strategy tracks the changing target distributions and helps prevent the model from becoming unduly biased toward the current target distribution. Furthermore, to anchor the target data distribution to the source distribution, the paper proposes minimizing the distance between each target feature and its corresponding source prototype. This approach stands out for its simplicity, favoring a direct distance objective over more complex alignment metrics such as KL divergence.
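The two components described above can be sketched in NumPy. This is an illustrative reconstruction from the summary, not the authors' implementation: the confidence threshold, EMA momentum, temperature, and function names are all assumptions, and the cosine-similarity form of the prototypical loss is one common choice.

```python
import numpy as np

def update_prototypes(prototypes, feats, probs, conf_thresh=0.9, momentum=0.99):
    # EMA update of class-wise target prototypes, using only "reliable"
    # samples whose predicted confidence exceeds a threshold (illustrative values).
    conf = probs.max(axis=1)
    pseudo = probs.argmax(axis=1)
    reliable = conf > conf_thresh
    for c in np.unique(pseudo[reliable]):
        mask = reliable & (pseudo == c)
        class_mean = feats[mask].mean(axis=0)
        prototypes[c] = momentum * prototypes[c] + (1 - momentum) * class_mean
    return prototypes

def prototypical_loss(feats, prototypes, pseudo, tau=0.1):
    # Cluster target features class-wise: cross-entropy over cosine
    # similarities between each feature and every class prototype.
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = f @ p.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(pseudo)), pseudo].mean()

def source_anchor_loss(feats, source_protos, pseudo):
    # Anchor each target feature to its corresponding source prototype
    # via a plain squared L2 distance (the "simple" alignment objective).
    return ((feats - source_protos[pseudo]) ** 2).sum(axis=1).mean()
```

In practice these losses would be added to the base CTA method's objective and back-propagated through the feature extractor at each test-time step.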

Empirical Evidence and Observations

Extensive experiments validate the effectiveness of the proposed method on standard CTA benchmarks such as ImageNet-C and CIFAR100-C. Notably, when applied atop existing methods, it yields significant performance improvements with minimal adaptation time overhead. The method not only raises average accuracy across classes and corruption conditions but also improves the model's calibration, reducing overconfidence in predictions. Moreover, the experiments demonstrate the method's robustness to the order in which target domains are encountered and its consistency across different batch sizes.
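The calibration claim is typically quantified with Expected Calibration Error (ECE): the confidence-weighted gap between predicted confidence and empirical accuracy. The sketch below is the standard binned ECE metric, not necessarily the paper's exact evaluation protocol; the bin count is an assumption.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    # Binned ECE: partition predictions by confidence, then sum the
    # bin-weighted absolute gaps between mean confidence and accuracy.
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    correct = (pred == labels).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```

A well-calibrated model drives this toward zero; an overconfident model (high confidence, lower accuracy) inflates it, which is the failure mode the bias-mitigation method reduces.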

Theoretical and Practical Implications

On the theoretical front, this paper contributes to the understanding of bias in continual learning scenarios and offers a viable strategy for its mitigation. Practically, the ease of integrating the proposed method with existing approaches makes it highly relevant for real-world applications that require continual adaptation to changing data distributions. Furthermore, the negligible increase in adaptation time underscores the method's feasibility for deployment in time-sensitive applications.

Speculative Look into the Future

Given the promising results, future work could explore several directions. These include a more granular understanding of the mechanisms driving the observed improvements and extending the methodology to address other forms of bias that may arise in continual learning scenarios. Furthermore, investigating the interplay between the proposed method and different model architectures or types of streaming data could yield insights into achieving even more robust and versatile test-time adaptation strategies.

Conclusions

In summary, this research makes a significant contribution to the field of continual learning by addressing the issue of biased predictions during test-time adaptation. Through a simplified yet effective approach, it not only enhances model performance across a spectrum of real-world conditions but also does so with minimal computational overhead. Its compatibility with existing continual adaptation methods further underscores its practical utility, marking a step forward in the development of adaptable, fair, and accurate machine learning models.
