
Online Feature Updates Improve Online (Generalized) Label Shift Adaptation (2402.03545v3)

Published 5 Feb 2024 in cs.LG

Abstract: This paper addresses the prevalent issue of label shift in an online setting with missing labels, where data distributions change over time and obtaining timely labels is challenging. While existing methods primarily focus on adjusting or updating the final layer of a pre-trained classifier, we explore the untapped potential of enhancing feature representations using unlabeled data at test time. Our novel method, Online Label Shift adaptation with Online Feature Updates (OLS-OFU), leverages self-supervised learning to refine the feature extraction process, thereby improving the prediction model. Through careful algorithm design, OLS-OFU theoretically maintains online regret convergence comparable to existing results in the literature while accounting for the improved features. Empirically, it achieves substantial improvements over existing methods, gains as significant as those that existing methods achieve over the baseline (i.e., no distribution shift adaptation).
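The abstract outlines a two-part online loop: at each round, the unlabeled test batch is first used for a self-supervised update of the feature extractor, and the classifier's predictions are then re-weighted to correct for the shifted label marginal. The sketch below illustrates one way such a loop could be wired up in PyTorch; the rotation-prediction self-supervised task, the tiny network, and the moving-average estimate of the test label marginal are illustrative assumptions, not the algorithm specified in the paper.

```python
# Hypothetical sketch of an online label-shift loop with test-time feature updates.
# The rotation-prediction task, model, and label-marginal estimator are assumptions
# for illustration only; they are not the paper's OLS-OFU algorithm.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10

class Net(nn.Module):
    """Shared feature extractor with a classification head and a self-supervised
    rotation-prediction head (4 rotation classes)."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(feat_dim, NUM_CLASSES)
        self.rotation_head = nn.Linear(feat_dim, 4)

def rotate_batch(x):
    """Return rotated copies of x (0/90/180/270 degrees) and their rotation labels."""
    xs, ys = [], []
    for k in range(4):
        xs.append(torch.rot90(x, k, dims=(2, 3)))
        ys.append(torch.full((x.size(0),), k, dtype=torch.long))
    return torch.cat(xs), torch.cat(ys)

def online_step(model, optimizer, x_unlabeled, q_est, p_train):
    """One online round: (1) self-supervised feature update on the unlabeled test
    batch, (2) label-shift re-weighting of the classifier outputs, (3) a crude
    moving-average update of the estimated test label marginal q_est."""
    # (1) Online feature update via the rotation-prediction task.
    model.train()
    xr, yr = rotate_batch(x_unlabeled)
    loss = F.cross_entropy(model.rotation_head(model.features(xr)), yr)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # (2) Predict with importance re-weighting p(y|x) * q(y) / p(y) (label shift).
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model.classifier(model.features(x_unlabeled)), dim=1)
        reweighted = probs * (q_est / p_train)
        preds = reweighted.argmax(dim=1)

    # (3) Exponential moving average of the test label marginal from soft predictions
    # (a stand-in for the unbiased estimators used in the label-shift literature).
    q_new = 0.9 * q_est + 0.1 * reweighted.div(reweighted.sum(1, keepdim=True)).mean(0)
    return preds, q_new

if __name__ == "__main__":
    model = Net()
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    p_train = torch.full((NUM_CLASSES,), 1.0 / NUM_CLASSES)  # training label marginal
    q = p_train.clone()                                      # current test-marginal estimate
    for t in range(5):                                       # simulated online rounds
        x_t = torch.randn(8, 3, 32, 32)                      # unlabeled test batch at time t
        preds, q = online_step(model, opt, x_t, q, p_train)
        print(f"round {t}: predicted label histogram",
              torch.bincount(preds, minlength=NUM_CLASSES).tolist())
```

In a faithful implementation, the self-supervised objective (e.g., rotation prediction, entropy minimization, or a contrastive loss) and the label-marginal estimator would be swapped for the specific choices analyzed in the paper.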

