Beyond Top-Class Agreement: Using Divergences to Forecast Performance under Distribution Shift (2312.08033v1)

Published 13 Dec 2023 in cs.LG and cs.AI

Abstract: Knowing if a model will generalize to data 'in the wild' is crucial for safe deployment. To this end, we study model disagreement notions that consider the full predictive distribution, specifically disagreement based on the Hellinger distance, the Jensen-Shannon divergence, and the Kullback-Leibler divergence. We find that divergence-based scores provide better test-error estimates and detection rates on out-of-distribution data than their top-1 counterparts. Experiments involve standard vision and foundation models.
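The disagreement notions the abstract names compare two models' full softmax outputs rather than only their argmax predictions. The following is a minimal illustrative sketch of those per-example scores, not the paper's implementation: the function names, the toy inputs, and the use of NumPy are assumptions made for this example, and the paper's aggregation of per-example scores into an error forecast is not reproduced here.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # Kullback-Leibler divergence KL(p || q) between categorical distributions,
    # clipped away from zero for numerical stability.
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def js_divergence(p, q):
    # Jensen-Shannon divergence: symmetrized KL to the mixture m = (p + q) / 2.
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def hellinger_distance(p, q):
    # Hellinger distance between categorical distributions; bounded in [0, 1].
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2, axis=-1))

def top1_disagreement(p, q):
    # Classic top-1 disagreement: 1 if the argmax classes differ, else 0.
    return (np.argmax(p, axis=-1) != np.argmax(q, axis=-1)).astype(float)

# Hypothetical toy data: softmax outputs of two models on a batch of 2 inputs.
p = np.array([[0.7, 0.2, 0.1], [0.4, 0.35, 0.25]])
q = np.array([[0.6, 0.3, 0.1], [0.3, 0.45, 0.25]])

print("top-1 disagreement:", top1_disagreement(p, q))   # [0. 1.]
print("Hellinger distance:", hellinger_distance(p, q))
print("Jensen-Shannon div:", js_divergence(p, q))
print("KL divergence:     ", kl_divergence(p, q))
```

Note how the second example pair registers full top-1 disagreement even though the two distributions are close, while the first pair registers zero despite a visible difference in confidence; divergence-based scores capture both gradations, which is the intuition behind using them to forecast performance under shift.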

