Bayesian Neural Networks with Domain Knowledge Priors (2402.13410v1)

Published 20 Feb 2024 in cs.LG and stat.ML

Abstract: Bayesian neural networks (BNNs) have recently gained popularity due to their ability to quantify model uncertainty. However, specifying a prior for BNNs that captures relevant domain knowledge is often extremely challenging. In this work, we propose a framework for integrating general forms of domain knowledge (i.e., any knowledge that can be represented by a loss function) into a BNN prior through variational inference, while enabling computationally efficient posterior inference and sampling. Specifically, our approach results in a prior over neural network weights that assigns high probability mass to models that better align with our domain knowledge, leading to posterior samples that also exhibit this behavior. We show that BNNs using our proposed domain knowledge priors outperform those with standard priors (e.g., isotropic Gaussian, Gaussian process), successfully incorporating diverse types of prior information such as fairness, physics rules, and healthcare knowledge and achieving better predictive performance. We also present techniques for transferring the learned priors across different model architectures, demonstrating their broad utility across various settings.

Summary

  • The paper integrates domain knowledge, expressed as a loss function, into BNN priors, yielding posteriors that better reflect that knowledge and achieve stronger predictive performance.
  • It employs variational inference to align weight distributions with domain-specific constraints for more reliable predictions.
  • Empirical results on diverse tasks, including DecoyMNIST, demonstrate enhanced performance and adaptability across architectures.

Bayesian Neural Networks with Domain Knowledge Priors

This paper introduces a framework for enhancing Bayesian Neural Networks (BNNs) by integrating domain knowledge into their priors. The authors address the challenge of leveraging prior knowledge, expressed as a loss function, to improve the performance and reliability of BNNs. Through variational inference, the proposed approach learns a prior distribution over neural network weights that aligns closely with the available domain knowledge, yielding more informative posterior samples.

BNNs have gained traction due to their capacity to capture model uncertainty, which is crucial for decision-making in sensitive domains such as healthcare and criminal justice. However, selecting a prior that reflects domain-specific insights remains difficult, especially given the high dimensionality of neural network weights. In practice, uninformative priors such as isotropic Gaussians are used by default, yet these fail to exploit available domain knowledge.

The framework proposed in this paper addresses this gap by formulating domain knowledge as a loss function, denoted ϕ. This loss function evaluates how well a model adheres to the given domain knowledge. For example, in physics-informed models, ϕ could measure the degree to which a model's predictions violate physical laws such as energy conservation. The prior over network weights is learned such that it assigns higher probability mass to weight configurations whose corresponding models attain low values of ϕ.
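To make the idea concrete, here is a minimal sketch of learning a prior that favors low-ϕ weights. It assumes a PyTorch setup with a tiny functional MLP, a toy conservation-style rule standing in for ϕ, and a factorized Gaussian prior trained with the reparameterization trick; none of these details come from the paper's actual implementation.

```python
import torch

def mlp_forward(w, x, d_in=2, hidden=16, d_out=2):
    """Tiny MLP whose weights are read from the flat vector `w`,
    so gradients flow from the outputs back to the prior parameters."""
    i = 0
    W1 = w[i:i + d_in * hidden].view(hidden, d_in); i += d_in * hidden
    b1 = w[i:i + hidden]; i += hidden
    W2 = w[i:i + hidden * d_out].view(d_out, hidden); i += hidden * d_out
    b2 = w[i:i + d_out]
    h = torch.tanh(x @ W1.T + b1)
    return h @ W2.T + b2

def phi(w, x):
    """Illustrative domain-knowledge loss: predictions should sum to 1
    across outputs (a stand-in for, e.g., a conservation rule)."""
    pred = mlp_forward(w, x)
    return (pred.sum(dim=-1) - 1.0).pow(2).mean()

n_params = 2 * 16 + 16 + 16 * 2 + 2            # matches mlp_forward above
prior_mu = torch.zeros(n_params, requires_grad=True)
prior_log_sigma = torch.full((n_params,), -2.0, requires_grad=True)
opt = torch.optim.Adam([prior_mu, prior_log_sigma], lr=1e-2)
x_unlabeled = torch.randn(256, 2)              # inputs on which the rule is checked

for step in range(500):
    eps = torch.randn(n_params)                       # reparameterization trick
    w = prior_mu + prior_log_sigma.exp() * eps        # one weight sample from the prior
    # Expected domain-knowledge loss under the prior, plus a small penalty that
    # keeps the prior close to a standard Gaussian (a KL-style regularizer).
    kl = 0.5 * (prior_mu.pow(2) + (2 * prior_log_sigma).exp()
                - 2 * prior_log_sigma - 1).sum()
    loss = phi(w, x_unlabeled) + 1e-4 * kl
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The learned (prior_mu, prior_log_sigma) can then serve as the prior for standard variational posterior inference on labeled data.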

Empirical evaluations demonstrate that BNNs incorporating these domain knowledge-informed priors outperform those with conventional priors across various datasets. The framework was tested on tasks with diverse types of prior information, such as feature importance in image classification, fairness in decision-making, clinical rules in healthcare interventions, and physical laws in dynamical systems. For instance, on the DecoyMNIST task, the proposed method improved accuracy by encoding the knowledge that spurious background features should be ignored.
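For the feature-importance setting, one plausible form of ϕ is an input-gradient (saliency) penalty on known spurious regions. The sketch below assumes such a mask (e.g., the decoy corner patches in DecoyMNIST) and illustrates only the kind of loss involved, not the authors' exact definition.

```python
import torch

def saliency_phi(model, x, spurious_mask):
    """Penalize input-gradient saliency on pixels known to be spurious:
    low values mean the model's predictions do not depend on those pixels."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    score = logits.logsumexp(dim=-1).sum()        # class-agnostic prediction score
    grads, = torch.autograd.grad(score, x, create_graph=True)
    return (grads * spurious_mask).pow(2).mean()
```

A loss of this form could play the role of ϕ when learning the prior, or be added directly to a training objective as a regularizer.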

Additionally, the paper explores mechanisms for transferring learned informative priors across different BNN architectures. Techniques based on maximum mean discrepancy (MMD) and moment matching are proposed, allowing learned priors to be efficiently adapted to new architectures, thereby increasing their practical utility without direct access to the original domain knowledge loss function.
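As a rough illustration of the MMD-based transfer idea, the sketch below fits a factorized Gaussian prior for a target architecture by matching, on a set of probe inputs, functions drawn from it against functions drawn from the source prior. The `sample_source_functions` interface, kernel bandwidth, and optimizer settings are assumptions for illustration, not the paper's procedure.

```python
import torch

def gaussian_mmd(a, b, bandwidth=1.0):
    """Biased estimate of squared MMD between two sample sets of shape (n, d)."""
    def k(u, v):
        return torch.exp(-torch.cdist(u, v).pow(2) / (2 * bandwidth ** 2))
    return k(a, a).mean() + k(b, b).mean() - 2 * k(a, b).mean()

def transfer_prior(sample_source_functions, target_forward, n_params, x_probe,
                   n_draws=16, steps=500, lr=1e-2):
    """Fit a factorized Gaussian prior for the target architecture so that
    functions drawn from it match source-prior functions on probe inputs."""
    mu = torch.zeros(n_params, requires_grad=True)
    log_sigma = torch.full((n_params,), -2.0, requires_grad=True)
    opt = torch.optim.Adam([mu, log_sigma], lr=lr)
    for _ in range(steps):
        ws = mu + log_sigma.exp() * torch.randn(n_draws, n_params)
        # Each row is one sampled function's flattened outputs on the probe inputs.
        target_out = torch.stack([target_forward(w, x_probe).flatten() for w in ws])
        source_out = sample_source_functions(x_probe, n_draws)   # (n_draws, n_probe * d_out)
        loss = gaussian_mmd(target_out, source_out)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return mu, log_sigma
```

Matching function outputs rather than weights is what makes the transfer architecture-agnostic: the two networks never need to share a parameterization, only a set of probe inputs.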

These informative priors have practical implications for deploying BNNs in real-world applications where domain knowledge is crucial for informed decision-making. On the theoretical side, the work enhances the representational power of Bayesian neural networks by systematically leveraging structured prior information.

In conclusion, the paper advances BNN methodology by embedding domain knowledge into prior distributions, improving predictive performance and enabling transfer of learned priors across network architectures. As the field progresses, automating the specification of domain knowledge loss functions and extending the framework to more complex models and domains could prove valuable for BNN applications.
