
Improving Entropy-Based Test-Time Adaptation from a Clustering View

Published 31 Oct 2023 in cs.AI | arXiv:2310.20327v6

Abstract: Domain shift is a common problem in the real world, where training data and test data follow different distributions. To deal with this problem, fully test-time adaptation (TTA) leverages the unlabeled data encountered during test time to adapt the model. In particular, entropy-based TTA (EBTTA) methods, which minimize the prediction entropy on test samples, have shown great success. In this paper, we introduce a new clustering perspective on EBTTA, viewing it as an iterative algorithm: 1) in the assignment step, the forward pass of the EBTTA model assigns labels to the test samples, and 2) in the updating step, the backward pass updates the model using the assigned samples. This perspective allows us to explore how entropy minimization influences test-time adaptation, and in turn guides improvements to EBTTA. We improve both the assignment step and the updating step, proposing robust label assignment, a similarity-preserving constraint, sample selection, and gradient accumulation to explicitly exploit more information. Experimental results demonstrate that our method achieves consistent improvements on various datasets. Code is provided in the supplementary material.
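To make the assignment/updating view concrete, here is a minimal sketch of Tent-style entropy minimization with two of the refinements the abstract names, sample selection and gradient accumulation. The entropy threshold, the accumulation schedule, and the choice to adapt only batch-norm affine parameters are illustrative assumptions, not the paper's exact method; robust label assignment and the similarity-preserving constraint are omitted for brevity.

```python
# Sketch of entropy-based test-time adaptation seen as clustering:
# the forward pass "assigns" soft labels, the backward pass "updates" the model.
import torch
import torch.nn as nn


def collect_bn_params(model: nn.Module):
    """Collect batch-norm affine parameters, the usual adaptation target in EBTTA."""
    params = []
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d)):
            if module.weight is not None:  # affine=True
                params += [module.weight, module.bias]
    return params


@torch.enable_grad()
def adapt_batch(model, optimizer, x, entropy_threshold=2.0, accum_steps=4, step_idx=0):
    """One adaptation step on a test batch x.

    Assignment step: the forward pass assigns a predicted distribution
    (soft cluster label) to each test sample. Updating step: entropy is
    minimized on selected low-entropy samples, with gradients accumulated
    over `accum_steps` batches before each optimizer update.
    """
    logits = model(x)  # assignment: soft labels for the batch
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)

    # Sample selection: keep only confident (low-entropy) samples.
    mask = entropy < entropy_threshold
    if mask.any():
        loss = entropy[mask].mean() / accum_steps
        loss.backward()  # gradients accumulate across successive calls

    # Gradient accumulation: update once every `accum_steps` batches.
    if (step_idx + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
    return logits.detach()
```

In use, one would keep the model in train mode (so batch-norm statistics follow the test distribution), build the optimizer over `collect_bn_params(model)`, and call `adapt_batch` on each incoming test batch with an increasing `step_idx`.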
