
Transitional Uncertainty with Layered Intermediate Predictions (2405.17494v2)

Published 25 May 2024 in cs.LG

Abstract: In this paper, we discuss feature engineering for single-pass uncertainty estimation. For accurate uncertainty estimates, neural networks must extract differences in the feature space that quantify uncertainty. This could be achieved by current single-pass approaches that maintain feature distances between data points as they traverse the network. While initial results are promising, maintaining feature distances within the network representations frequently inhibits information compression and opposes the learning objective. We study this effect theoretically and empirically to arrive at a simple conclusion: preserving feature distances in the output is beneficial when the preserved features contribute to learning the label distribution, and detrimental otherwise. We then propose Transitional Uncertainty with Layered Intermediate Predictions (TULIP) as a simple approach to address the shortcomings of current single-pass estimators. Specifically, we implement feature preservation by extracting features from intermediate representations before information is collapsed by subsequent layers. We refer to the underlying preservation mechanism as transitional feature preservation. We show that TULIP matches or outperforms current single-pass methods on standard benchmarks and in practical settings where these methods are less reliable (imbalances, complex architectures, medical modalities).
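To make the core idea concrete, the following is a minimal sketch of attaching lightweight prediction heads to intermediate representations and aggregating them into a single-pass uncertainty score. It is not the authors' implementation: the backbone, head design, aggregation rule (entropy of the averaged softmax), and all hyperparameters are illustrative assumptions.

```python
# Illustrative sketch (not the authors' code): a feed-forward backbone with a
# linear prediction head after each block. Each head reads features before
# later layers compress them further; the spread of the head predictions is
# used as a simple single-pass uncertainty score.
import torch
import torch.nn as nn
import torch.nn.functional as F


class IntermediateHeadNet(nn.Module):
    def __init__(self, in_dim=784, hidden_dim=256, num_classes=10, num_blocks=3):
        super().__init__()
        self.blocks = nn.ModuleList()
        self.heads = nn.ModuleList()
        dim = in_dim
        for _ in range(num_blocks):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden_dim), nn.ReLU()))
            # one linear classifier per intermediate representation
            self.heads.append(nn.Linear(hidden_dim, num_classes))
            dim = hidden_dim

    def forward(self, x):
        logits_per_head = []
        h = x
        for block, head in zip(self.blocks, self.heads):
            h = block(h)
            logits_per_head.append(head(h))
        return logits_per_head  # list of [batch, num_classes] tensors


def uncertainty_from_heads(logits_per_head):
    # Average the per-head softmax outputs and take the entropy of the mean
    # as the uncertainty score (one of several plausible aggregation choices).
    probs = torch.stack([F.softmax(l, dim=-1) for l in logits_per_head])  # [H, B, C]
    mean_probs = probs.mean(dim=0)
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy


if __name__ == "__main__":
    model = IntermediateHeadNet()
    x = torch.randn(4, 784)
    logits = model(x)
    # Training would typically sum a cross-entropy loss over all heads; here we
    # only show the inference-time prediction and uncertainty estimate.
    preds, unc = uncertainty_from_heads(logits)
    print(preds.argmax(dim=-1), unc)
```

The design choice this illustrates is that the later layers of the backbone are free to compress information for the learning objective, while the intermediate heads retain access to features before that collapse, which is the transitional feature preservation the paper describes.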
