Understanding Pathologies of Deep Heteroskedastic Regression (2306.16717v2)

Published 29 Jun 2023 in stat.ML and cs.LG

Abstract: Deep, overparameterized regression models are notorious for their tendency to overfit. This problem is exacerbated in heteroskedastic models, which predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely; at the other, they overfit the residual noise while predicting a constant, uninformative mean. We observe a lack of middle ground, suggesting a phase transition dependent on model regularization strength. Empirical verification supports this conjecture by fitting numerous models with varying mean and variance regularization. To explain the transition, we develop a theoretical framework based on a statistical field theory, yielding qualitative agreement with experiments. As a practical consequence, our analysis simplifies hyperparameter tuning from a two-dimensional to a one-dimensional search, substantially reducing the computational burden. Experiments on diverse datasets, including UCI datasets and the large-scale ClimSim climate dataset, demonstrate significantly improved performance in various calibration tasks.
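The setting the abstract describes can be made concrete with a minimal sketch (not the paper's implementation): a heteroskedastic Gaussian negative log-likelihood where one set of parameters predicts the mean and another predicts the log-variance, each with its own L2 penalty. The names `lam_mu` and `lam_s` here stand in for the two regularization strengths whose joint tuning the paper reduces to a one-dimensional search; the feature map and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy heteroskedastic data: noise level grows with |x| (illustrative assumption).
x = rng.uniform(-2, 2, size=(200, 1))
y = np.sin(2 * x) + rng.normal(0.0, 0.1 + 0.2 * np.abs(x))

# Linear-in-features models for the mean and the log-variance.
phi = np.hstack([x, x**2, np.ones_like(x)])  # simple quadratic feature map
w_mu = np.zeros((3, 1))                      # mean parameters
w_s = np.zeros((3, 1))                       # log-variance parameters
lam_mu, lam_s = 1e-3, 1e-2                   # the two regularization strengths

def nll(w_mu, w_s):
    """Regularized Gaussian NLL: 0.5*(log s^2 + (y-mu)^2 / s^2) + L2 penalties."""
    mu, log_var = phi @ w_mu, phi @ w_s
    return (0.5 * np.mean(log_var + (y - mu) ** 2 * np.exp(-log_var))
            + lam_mu * np.sum(w_mu**2) + lam_s * np.sum(w_s**2))

# Plain gradient descent on both heads jointly.
lr = 0.05
for _ in range(2000):
    mu, log_var = phi @ w_mu, phi @ w_s
    inv_var = np.exp(-log_var)
    err = y - mu
    g_mu = -(phi.T @ (err * inv_var)) / len(y) + 2 * lam_mu * w_mu
    g_s = 0.5 * (phi.T @ (1 - err**2 * inv_var)) / len(y) + 2 * lam_s * w_s
    w_mu -= lr * g_mu
    w_s -= lr * g_s
```

The two failure modes in the abstract correspond to the extremes of this objective: with `lam_mu` too small the mean interpolates the data and the predicted variance collapses toward zero, while with `lam_mu` too large the mean goes flat and the variance head absorbs all the structure as "noise".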

References (34)
  1. A review of uncertainty quantification in deep learning: Techniques, applications and challenges. Information Fusion, 76:243–297, December 2021. ISSN 1566-2535. 10.1016/j.inffus.2021.05.008.
  2. Condensed Matter Field Theory. Cambridge University Press, Cambridge, 2 edition, 2010. ISBN 978-0-521-76975-4. 10.1017/CBO9780511789984.
  3. Heteroskedasticity in Multiple Regression Analysis: What it is, How to Detect it and How to Solve it with Applications in R and SPSS. Practical Assessment, Research, and Evaluation, 24(1), November 2019. ISSN 1531-7714. 10.7275/q5xr-fr95.
  4. Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification. 2021.
  5. Why neural networks find simple solutions: the many regularizers of geometric complexity. 2022.
  6. Density Functional Theory: An Advanced Course. Theoretical and Mathematical Physics. Springer, Berlin, Heidelberg, 2011. ISBN 978-3-642-14089-1 978-3-642-14090-7. 10.1007/978-3-642-14090-7.
  7. Detecting Heteroscedasticity in Nonparametric Regression. Journal of the Royal Statistical Society: Series B (Methodological), 55(1):145–155, 1993. ISSN 2517-6161. 10.1111/j.2517-6161.1993.tb01474.x.
  8. Bengt Fornberg. Generation of Finite Difference Formulas on Arbitrarily Spaced Grids. Mathematics of Computation, 51:699–706, October 1988.
  9. Deep Classifiers with Label Noise Modeling and Distance Awareness. Transactions on Machine Learning Research, 2022.
  10. J. Gerritsma. Geometry, resistance and stability of the Delft Systematic Yacht hull series. TU Delft, Faculty of Marine Technology, Ship Hydromechanics Laboratory, Report No. 520-P, Published in: International Shipbuilding Progress, ISP, Delft, The Netherlands, Volume 28, No. 328, also 7th HISWA Symposium, Amsterdam, The Netherlands, 1981.
  11. Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1):81–102, March 1978. ISSN 0095-0696. 10.1016/0095-0696(78)90006-2.
  12. Kurt Hornik. Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2):251–257, January 1991. ISSN 0893-6080. 10.1016/0893-6080(91)90009-T.
  13. Peter J. Huber. The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, 5.1:221–234, January 1967. Publisher: University of California Press.
  14. Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods. Machine Learning, 110(3):457–506, March 2021. ISSN 1573-0565. 10.1007/s10994-021-05946-3.
  15. Effective Bayesian Heteroscedastic Regression with Deep Neural Networks. 2023.
  16. The UCI Machine Learning Repository. URL https://archive.ics.uci.edu.
  17. What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017.
  18. Neural Active Learning on Heteroskedastic Distributions, November 2022. arXiv:2211.00928 [cs].
  19. Accurate Uncertainties for Deep Learning Using Calibrated Regression. 2018.
  20. Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles. In Neural Information Processing Systems, 2017.
  21. Statistical Physics: Volume 5. Elsevier, October 2013. ISBN 978-0-08-057046-4.
  22. Heteroscedastic Gaussian process regression. In Proceedings of the 22nd international conference on Machine learning - ICML ’05, pages 489–496, Bonn, Germany, 2005. ACM Press. ISBN 978-1-59593-180-1. 10.1145/1102351.1102413.
  23. Evaluating and Calibrating Uncertainty Prediction in Regression Tasks. Sensors (Basel, Switzerland), 22(15):5540, July 2022. ISSN 1424-8220. 10.3390/s22155540. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9330317/.
  24. Estimating the mean and variance of the target probability distribution. In Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN’94), volume 1, pages 55–60 vol.1, June 1994. 10.1109/ICNN.1994.374138.
  25. On the Pitfalls of Heteroscedastic Uncertainty Estimation with Probabilistic Neural Networks. 2022.
  26. Reliable training and estimation of variance networks. 2019.
  27. Variational Variance: Simple, Reliable, Calibrated Heteroscedastic Noise Variance Parameterization, October 2020. arXiv:2006.04910 [cs, stat].
  28. Faithful Heteroscedastic Regression with Neural Networks. 2023.
  29. Student-t Variational Autoencoder for Robust Density Estimation. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, pages 2696–2702, Stockholm, Sweden, July 2018. International Joint Conferences on Artificial Intelligence Organization. ISBN 978-0-9992411-2-7. 10.24963/ijcai.2018/374.
  30. Pınar Tüfekci. Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods. International Journal of Electrical Power & Energy Systems, 60:126–140, September 2014. ISSN 0142-0615. 10.1016/j.ijepes.2014.02.027.
  31. Stanislaus S. Uyanto. Monte Carlo power comparison of seven most commonly used heteroscedasticity tests. Communications in Statistics - Simulation and Computation, 51(4):2065–2082, April 2022. ISSN 0361-0918. 10.1080/03610918.2019.1692031.
  32. Robust probabilistic modeling with Bayesian data reweighting. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, ICML’17, pages 3646–3655, Sydney, NSW, Australia, August 2017. JMLR.org.
  33. I-Cheng Yeh. Concrete Compressive Strength, 2007.
  34. ClimSim: An open large-scale dataset for training high-resolution physics emulators in hybrid multi-scale climate simulators, September 2023. URL http://arxiv.org/abs/2306.08754. arXiv:2306.08754 [physics].