
Unraveling Batch Normalization for Realistic Test-Time Adaptation (2312.09486v3)

Published 15 Dec 2023 in cs.CV and cs.LG

Abstract: While recent test-time adaptations exhibit efficacy by adjusting batch normalization to narrow domain disparities, their effectiveness diminishes with realistic mini-batches due to inaccurate target estimation. As previous attempts merely introduce source statistics to mitigate this issue, the fundamental problem of inaccurate target estimation still persists, leaving the intrinsic test-time domain shifts unresolved. This paper delves into the problem of mini-batch degradation. By unraveling batch normalization, we discover that the inexact target statistics largely stem from the substantially reduced class diversity within each batch. Drawing upon this insight, we introduce a straightforward tool, Test-time Exponential Moving Average (TEMA), to bridge the class diversity gap between training and testing batches. Importantly, our TEMA adaptively extends the scope of typical methods beyond the current batch to incorporate a diverse set of class information, which in turn yields more accurate target estimation. Built upon this foundation, we further design a novel layer-wise rectification strategy to consistently promote test-time performance. Our proposed method enjoys a unique advantage as it requires neither training nor tuning parameters, offering a truly hassle-free solution. It significantly enhances model robustness against shifted domains and maintains resilience in diverse real-world scenarios with various batch sizes, achieving state-of-the-art performance on several major benchmarks. Code is available at https://github.com/kiwi12138/RealisticTTA.
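The abstract's main ingredient can be illustrated with a small PyTorch sketch: instead of normalizing each test batch with its own (class-poor) statistics, batch-normalization statistics are accumulated as an exponential moving average over the test stream, so that class information from many batches contributes to the target estimate. The class and function names (`TestTimeEMABatchNorm`, `convert_bn`) and the momentum value below are illustrative assumptions, not the paper's implementation, and the layer-wise rectification strategy is omitted; see the linked repository for the authors' code.

```python
# Hypothetical sketch of test-time EMA over batch-normalization statistics.
# Not the authors' implementation; names and the momentum value are assumptions.
import torch
import torch.nn as nn


class TestTimeEMABatchNorm(nn.Module):
    """Wraps a frozen BatchNorm2d and normalizes test batches with statistics
    accumulated as an exponential moving average over all test batches seen
    so far, instead of the current batch alone."""

    def __init__(self, bn: nn.BatchNorm2d, momentum: float = 0.1):
        super().__init__()
        self.bn = bn              # source-trained BN (affine parameters reused)
        self.momentum = momentum  # EMA update rate (assumed value)
        # Initialize the running estimate with the source statistics.
        self.register_buffer("ema_mean", bn.running_mean.clone())
        self.register_buffer("ema_var", bn.running_var.clone())

    @torch.no_grad()
    def _update(self, x: torch.Tensor) -> None:
        # Per-channel statistics of the current test batch.
        batch_mean = x.mean(dim=(0, 2, 3))
        batch_var = x.var(dim=(0, 2, 3), unbiased=False)
        # Blend them into the running test-time estimate.
        self.ema_mean.mul_(1 - self.momentum).add_(self.momentum * batch_mean)
        self.ema_var.mul_(1 - self.momentum).add_(self.momentum * batch_var)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self._update(x)
        # Normalize with the accumulated statistics; keep source affine params.
        return nn.functional.batch_norm(
            x, self.ema_mean, self.ema_var,
            weight=self.bn.weight, bias=self.bn.bias,
            training=False, eps=self.bn.eps,
        )


def convert_bn(model: nn.Module, momentum: float = 0.1) -> nn.Module:
    """Recursively replace every BatchNorm2d layer with the EMA variant."""
    for name, child in model.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(model, name, TestTimeEMABatchNorm(child, momentum))
        else:
            convert_bn(child, momentum)
    return model
```

A typical usage would be `model = convert_bn(model.eval())`, after which the model is simply run forward on the incoming test batches; no gradients or hyperparameter tuning are involved, consistent with the abstract's claim of a training-free and tuning-free procedure.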
