Dual Accuracy-Quality-Driven Neural Network for Prediction Interval Generation (2212.06370v4)
Abstract: Accurate uncertainty quantification is necessary to enhance the reliability of deep learning models in real-world applications. For regression tasks, prediction intervals (PIs) should be provided along with the deterministic predictions of deep learning models. Such PIs are useful, or "high-quality," only if they are sufficiently narrow while capturing most of the probability density. In this paper, we present a method that learns prediction intervals for regression-based neural networks automatically, in addition to the conventional target predictions. In particular, we train two companion neural networks: one with a single output, the target estimate, and another with two outputs, the lower and upper bounds of the corresponding PI. Our main contribution is the design of a novel loss function for the PI-generation network that takes into account the output of the target-estimation network and has two optimization objectives: minimizing the mean prediction interval width and ensuring PI integrity via constraints that implicitly maximize the prediction interval probability coverage. Furthermore, we introduce a self-adaptive coefficient that balances both objectives within the loss function, alleviating the need for fine-tuning. Experiments on a synthetic dataset, eight benchmark datasets, and a real-world crop yield prediction dataset showed that our method maintained the nominal probability coverage and produced significantly narrower PIs than three state-of-the-art neural-network-based methods, without degrading target-estimation accuracy. In other words, our method produced higher-quality PIs.
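To make the two-objective design concrete, here is a minimal PyTorch sketch, not the authors' exact formulation: the function name `dual_pi_loss`, the nominal coverage `tau`, the step size `eta`, the hinge-style violation penalties, and the coverage-driven update of the balancing coefficient `lam` are all illustrative assumptions standing in for the paper's constraint terms and self-adaptive coefficient.

```python
import torch

def dual_pi_loss(y_l, y_u, y, y_hat, lam, tau=0.95, eta=0.01):
    """Sketch of a width-vs-coverage PI loss with a self-adaptive coefficient."""
    width = (y_u - y_l).mean()                          # objective 1: mean PI width
    # Objective 2 (implicit coverage): penalize samples where the observed
    # target y or the companion network's estimate y_hat escapes the interval.
    below = torch.relu(y_l - torch.minimum(y, y_hat))   # lower-bound violations
    above = torch.relu(torch.maximum(y, y_hat) - y_u)   # upper-bound violations
    loss = width + lam * (below + above).mean()
    with torch.no_grad():                               # adapt lam each batch
        picp = ((y >= y_l) & (y <= y_u)).float().mean() # empirical coverage
        lam = max(lam + eta * (tau - picp.item()), 0.0) # grow lam if undercovering
    return loss, lam

# Toy usage: fixed-width bounds around a point estimate for a batch of 32 targets.
y = torch.randn(32)
y_hat = y + 0.1 * torch.randn(32)
y_l, y_u = y_hat - 0.5, y_hat + 0.5
loss, lam = dual_pi_loss(y_l, y_u, y, y_hat, lam=1.0)
```

The key design idea this sketch mirrors is that coverage is never optimized directly (it is non-differentiable); instead, differentiable violation penalties push the bounds outward, while the adaptive coefficient `lam` shifts the balance between narrowness and coverage based on how the current empirical coverage compares to the nominal level.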
- Giorgio Morales
- John W. Sheppard