Dropout Inference in Bayesian Neural Networks with Alpha-divergences (1703.02914v1)

Published 8 Mar 2017 in cs.LG and stat.ML

Abstract: To obtain uncertainty estimates with real-world Bayesian deep learning models, practical inference approximations are needed. Dropout variational inference (VI), for example, has been used for machine vision and medical applications, but VI can severely underestimate model uncertainty. Alpha-divergences are alternatives to VI's KL objective that are able to avoid VI's uncertainty underestimation. But these are hard to use in practice: existing techniques can only use Gaussian approximating distributions and require existing models to be changed radically, and are thus of limited use for practitioners. We propose a re-parametrisation of the alpha-divergence objectives, deriving a simple inference technique which, together with dropout, can be easily implemented with existing models by simply changing the loss of the model. We demonstrate improved uncertainty estimates and accuracy compared to VI in dropout networks. We study our model's epistemic uncertainty far away from the data using adversarial images, showing that these can be distinguished from non-adversarial images by examining our model's uncertainty.

Citations (189)

Summary

  • The paper presents a novel dropout-based inference method using reparameterized BB-α divergence to better capture uncertainty in Bayesian neural networks.
  • It introduces an alpha-scaled loss that improves test log-likelihood and yields more reliable uncertainty estimates than traditional variational inference.
  • Enhanced epistemic uncertainty detection in adversarial examples suggests the approach is promising for robust and secure AI applications.

Dropout Inference in Bayesian Neural Networks with Alpha-divergences

The paper "Dropout Inference in Bayesian Neural Networks with Alpha-divergences" by Yingzhen Li and Yarin Gal presents a novel approach to improve uncertainty estimates in Bayesian Neural Networks (BNNs) using dropout as a tool for variational inference. The authors propose a reparameterized version of Black-box alpha (BB-α\alpha) divergence minimization that can be applied to dropout-based variational distributions, thereby alleviating the known issues with traditional variational inference (VI) in terms of underestimating model uncertainty.

Key Contributions

  1. Alpha-divergence for BNNs: The paper discusses the limitations of VI, specifically its tendency to severely underestimate model uncertainty. To address this, the authors explore alpha-divergences as an alternative to the Kullback-Leibler (KL) divergence used in conventional VI. As the definition above shows, alpha-divergences can better capture model uncertainty by trading off zero-forcing against mass-covering behavior.
  2. Reparameterized Loss: The authors introduce a reparameterization of the BB-α energy that admits a simple implementation compatible with dropout. This approach modifies the neural network loss by incorporating alpha as a scaling factor, transforming the training objective without requiring architectural changes (a minimal sketch follows this list).
  3. Evaluation and Results: The proposed approach demonstrated improved model uncertainty estimates and, in several cases, better test log-likelihood compared to VI. These improvements were documented across various BNN configurations and datasets, revealing the enhanced robustness of BNNs using BB-α compared to traditional models.
  4. Adversarial Robustness: The paper suggests that BNNs integrated with alpha-divergences exhibit increased epistemic uncertainty on adversarial examples, effectively distinguishing them from non-adversarial examples. The enhanced uncertainty quantification implies potential applications in identifying adversarial attacks in neural networks.
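
To make the reparameterized loss concrete, the following is a minimal PyTorch sketch of an alpha-scaled dropout objective in the spirit of the paper; it is not the authors' code. It assumes a classifier `model` containing dropout layers; the function name `alpha_dropout_loss`, the Monte Carlo sample count `k`, and the plain L2 penalty standing in for the objective's prior term are illustrative assumptions.

```python
import math

import torch
import torch.nn.functional as F

def alpha_dropout_loss(model, x, y, alpha=0.5, k=10, weight_decay=1e-4):
    """Monte Carlo alpha-divergence loss for a dropout classifier (sketch).

    Runs k stochastic forward passes with dropout left active, computes the
    per-example log-likelihood under each sampled dropout mask, and combines
    them with a log-mean-exp rescaled by 1/alpha.
    """
    model.train()  # keep dropout stochastic so each pass samples a new mask
    # log p(y_n | x_n, w_k) for each mask; shape (k, batch)
    log_liks = torch.stack(
        [-F.cross_entropy(model(x), y, reduction="none") for _ in range(k)]
    )
    # per example: (1/alpha) * log of (1/k) * sum_k p(y | x, w_k)^alpha
    log_mean = torch.logsumexp(alpha * log_liks, dim=0) - math.log(k)
    nll = -(1.0 / alpha) * log_mean.sum()
    # simple L2 penalty standing in for the prior/regularization term
    l2 = sum((p ** 2).sum() for p in model.parameters())
    return nll + weight_decay * l2
```

As `alpha` approaches 0, the log-mean-exp collapses to the average log-likelihood over the k masks, recovering the standard dropout VI objective, consistent with VI being the α → 0 member of the alpha-divergence family.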

Implications and Future Work

The incorporation of alpha-divergences with dropout provides a more reliable approach for quantifying uncertainty in BNNs, which is crucial for applications where uncertainty estimates are critical, such as medical diagnostics and autonomous systems. The improved robustness to adversarial examples suggests promising avenues for building more secure AI systems.
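
As a concrete illustration of the uncertainty-based adversarial check described above, one can score inputs by the entropy of the Monte Carlo dropout predictive distribution and flag high-scoring inputs. This is a minimal sketch assuming the same kind of dropout classifier as before; `predictive_entropy` and the number of samples are illustrative choices rather than the paper's exact protocol.

```python
import torch

@torch.no_grad()
def predictive_entropy(model, x, k=50):
    """Predictive uncertainty via MC dropout (sketch).

    Averages k stochastic softmax outputs and returns the entropy of the
    mean predictive distribution; higher values indicate inputs the model
    is uncertain about, e.g. adversarial images far from the training data.
    """
    model.train()  # dropout stays active so each pass samples a new mask
    probs = torch.stack(
        [model(x).softmax(dim=-1) for _ in range(k)]
    ).mean(dim=0)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
```

A simple detector can then threshold this score, with the threshold calibrated on held-out non-adversarial inputs.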

The authors' approach paves the way for further exploration of computational trade-offs and efficiency. The implications of this method extend to the larger, more computationally intensive models typical of modern deep learning architectures. Future work could investigate the performance of this reparameterization in other variational Bayesian methods and assess its viability in real-time applications, where model updates must be both fast and reliable.

In conclusion, the paper outlines significant advancements in Bayesian deep learning by marrying alpha-divergences with practical aspects of dropout variational inference, promoting robustness and reliability in predictive modeling. This work provides a framework for further research into scalable and effective uncertainty quantification techniques in neural networks.