Model and Feature Diversity for Bayesian Neural Networks in Mutual Learning (2407.02721v1)

Published 3 Jul 2024 in cs.LG and cs.CV

Abstract: Bayesian Neural Networks (BNNs) place probability distributions over model parameters, enabling uncertainty quantification in predictions. However, they often underperform deterministic neural networks. Mutual learning can effectively enhance the performance of peer BNNs. In this paper, we propose a novel approach to improving BNN performance through deep mutual learning. The proposed approach increases diversity in both network parameter distributions and feature distributions, encouraging peer networks to acquire distinct features that capture different characteristics of the input, which enhances the effectiveness of mutual learning. Experimental results demonstrate significant improvements in classification accuracy, negative log-likelihood, and expected calibration error compared to traditional mutual learning for BNNs.
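
To make the mutual-learning setup concrete, here is a minimal PyTorch-style sketch of one training step for two peer networks: each peer fits the labels with cross-entropy, mimics the other's predictions through a KL term, and pays a cosine-similarity penalty that pushes the peers' penultimate features apart. This is an illustrative sketch, not the paper's method: the diversity penalty, the loss weights beta_mimic and gamma_div, and the assumption that each network returns (logits, features) are hypothetical simplifications, and the Bayesian treatment of the parameter distributions is omitted.

```python
# Minimal sketch of deep mutual learning with a feature-diversity term.
# Hypothetical simplification: peers are treated as ordinary networks that
# return (logits, penultimate_features); the paper's Bayesian parameter
# distributions and its exact diversity objectives are not reproduced here.
import torch.nn.functional as F

def mutual_learning_step(net_a, net_b, opt_a, opt_b, x, y,
                         beta_mimic=1.0, gamma_div=0.1):
    logits_a, feat_a = net_a(x)
    logits_b, feat_b = net_b(x)

    # Supervised loss for each peer.
    ce_a = F.cross_entropy(logits_a, y)
    ce_b = F.cross_entropy(logits_b, y)

    # Mutual mimicry: each peer matches the other's (detached) predictions.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b.detach(), dim=1),
                    reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=1),
                    F.softmax(logits_a.detach(), dim=1),
                    reduction="batchmean")

    # Feature diversity: penalizing cosine similarity between peer features
    # encourages the peers to capture different characteristics of the input.
    div_a = F.cosine_similarity(feat_a, feat_b.detach(), dim=1).mean()
    div_b = F.cosine_similarity(feat_b, feat_a.detach(), dim=1).mean()

    loss_a = ce_a + beta_mimic * kl_a + gamma_div * div_a
    opt_a.zero_grad()
    loss_a.backward()
    opt_a.step()

    loss_b = ce_b + beta_mimic * kl_b + gamma_div * div_b
    opt_b.zero_grad()
    loss_b.backward()
    opt_b.step()
    return loss_a.item(), loss_b.item()
```

Detaching the partner's outputs keeps each peer's update independent of the other's gradients, mirroring the alternating optimization typical of deep mutual learning.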
