Restricted Bayesian Neural Network (2403.04810v3)
Abstract: Modern deep learning tools are remarkably effective at addressing intricate problems. However, their operation as black-box models introduces increased uncertainty in predictions. They also contend with various challenges, including the substantial storage space required by large networks, overfitting, underfitting, vanishing gradients, and more. This study explores the concept of Bayesian Neural Networks and presents a novel architecture designed to significantly reduce the storage space complexity of a network. Furthermore, we introduce an algorithm that handles uncertainty efficiently and converges robustly without becoming trapped in local optima, particularly when the objective function is not perfectly convex.
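To make the Bayesian Neural Network idea referenced in the abstract concrete, below is a minimal sketch (not the paper's restricted architecture or its algorithm) of a variational "Bayes by Backprop"-style layer in PyTorch: weights are Gaussian distributions rather than point estimates, training minimizes an ELBO-style loss, and repeated stochastic forward passes give a predictive uncertainty estimate. All names here (`BayesianLinear`, the toy data, the hyperparameters) are illustrative assumptions, not from the paper.

```python
# Minimal sketch of a variational Bayesian linear layer (assumed setup, not the
# paper's method): weights are Gaussians, sampled with the reparameterization trick.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Linear layer whose weights are Gaussian distributions, not point estimates."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # Variational parameters: per-weight mean and (softplus-transformed) std.
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))

    def forward(self, x):
        # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, 1).
        w_sigma = F.softplus(self.w_rho)
        b_sigma = F.softplus(self.b_rho)
        w = self.w_mu + w_sigma * torch.randn_like(w_sigma)
        b = self.b_mu + b_sigma * torch.randn_like(b_sigma)
        return F.linear(x, w, b)

    def kl(self):
        # Closed-form KL(q(w) || p(w)) against a standard-normal prior.
        def kl_gauss(mu, sigma):
            return (sigma.pow(2) + mu.pow(2) - 1 - 2 * torch.log(sigma)).sum() / 2
        return kl_gauss(self.w_mu, F.softplus(self.w_rho)) + \
               kl_gauss(self.b_mu, F.softplus(self.b_rho))


if __name__ == "__main__":
    # Toy regression: minimize (MSE + KL / N), an ELBO-style objective.
    torch.manual_seed(0)
    x = torch.linspace(-1, 1, 128).unsqueeze(1)
    y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)

    model = nn.Sequential(BayesianLinear(1, 32), nn.ReLU(), BayesianLinear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)

    for step in range(500):
        opt.zero_grad()
        kl = sum(m.kl() for m in model if isinstance(m, BayesianLinear))
        loss = F.mse_loss(model(x), y) + kl / x.shape[0]
        loss.backward()
        opt.step()

    # Predictive uncertainty: average several stochastic forward passes.
    preds = torch.stack([model(x) for _ in range(20)])
    print("predictive mean/std at x=0:",
          preds.mean(0)[64].item(), preds.std(0)[64].item())
```

Because every forward pass draws fresh weight samples, the spread across repeated passes serves as the uncertainty estimate the abstract alludes to; the paper's contribution (restricting the parameterization to reduce storage and avoiding poor local optima) would replace the plain per-weight Gaussians assumed here.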