Credal Sets: Modeling Epistemic Uncertainty
- Credal Sets are closed, convex sets of probability distributions that capture both inherent randomness and epistemic uncertainty.
- They are constructed using interval probabilities, expert judgments, and data-driven methods to model uncertainty in fields like AI safety and reliability engineering.
- They enable robust inference via convex optimization and advanced algorithmic frameworks, though at the cost of increased computational complexity.
Credal sets are closed, convex sets of probability distributions designed to rigorously represent epistemic uncertainty—arising from imprecision, ambiguity, ignorance, or conflict in probabilistic modeling—over a finite or continuous outcome space. Unlike traditional probabilistic approaches that assign a single distribution to model uncertainty (thereby capturing only aleatoric, or inherent, uncertainty), credal sets express both aleatoric and epistemic uncertainty by specifying all distributions in a plausibly admissible range as dictated by available information, expert judgment, or imprecise data. This framework is foundational in imprecise probability and has become central to robust statistical modeling, reliable machine learning, uncertainty quantification in graphical models, and critical domains such as reliability engineering and AI safety.
1. Mathematical Definition and Structural Properties
A credal set $\mathcal{K}$ on a finite outcome space $\Omega$ is defined as a closed, convex subset of the probability simplex $\Delta(\Omega) = \{p : p(\omega) \ge 0,\ \sum_{\omega \in \Omega} p(\omega) = 1\}$. Convexity ensures that any mixture of admissible probability measures is itself admissible—a key epistemic property. When $\mathcal{K}$ is represented by a finite set of vertices (the extreme points), it is a convex polytope, also known as a finitely generated credal set (Caprio et al., 2023).
Credal sets admit several equivalent representations, including:
- The vertex (V-) representation: $\mathcal{K} = \mathrm{conv}\{p^{(1)}, \dots, p^{(m)}\}$, the convex hull of a finite set of extreme distributions.
- The half-space (H-) representation: as the intersection of finitely many linear inequalities.
Key notions include lower and upper expectations of a gamble $f$, $\underline{E}[f] = \min_{p \in \mathcal{K}} E_p[f]$ and $\overline{E}[f] = \max_{p \in \mathcal{K}} E_p[f]$, and lower/upper probabilities for any event $A \subseteq \Omega$: $\underline{P}(A) = \min_{p \in \mathcal{K}} p(A)$ and $\overline{P}(A) = \max_{p \in \mathcal{K}} p(A)$.
In infinite spaces, the definition generalizes: $\mathcal{K}$ becomes a closed, convex set of probability measures, but many algorithmic advantages and geometric insights from the polytopal case may be lost.
Geometric aspects such as normal cones, simplicial fans, and adjacency relations between convex polytope faces play a fundamental role in characterizing extremal points, supporting robust model construction and optimization over credal sets (Škulj, 2022). In 2-monotone or interval-probability models, the geometry is especially tractable, with extreme points corresponding to chain-based or interval-attaining measures.
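As a minimal illustration of the V-representation (the vertices and function names below are illustrative, not from any cited implementation): since expectation is a linear functional, its minimum and maximum over a polytope are attained at extreme points, so lower and upper expectations of a finitely generated credal set can be computed by scanning the vertices.

```python
import numpy as np

def lower_upper_expectation(vertices, f):
    """Lower/upper expectation of gamble f over a finitely
    generated credal set given by its extreme points.

    A linear functional attains its optima at vertices of a
    polytope, so scanning the extreme points suffices."""
    vals = [float(np.dot(p, f)) for p in vertices]
    return min(vals), max(vals)

def lower_upper_probability(vertices, event):
    """Lower/upper probability of an event given as an index set."""
    indicator = np.zeros(len(vertices[0]))
    indicator[list(event)] = 1.0
    return lower_upper_expectation(vertices, indicator)

# Example: credal set on a 3-outcome space with two extreme points
V = [np.array([0.5, 0.3, 0.2]), np.array([0.2, 0.5, 0.3])]
lo, hi = lower_upper_probability(V, {0})  # bounds on P({omega_1})
```

For H-represented credal sets the same bounds would instead come from a linear program over the inequality constraints; the vertex scan above is the natural counterpart for the V-representation.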
2. Construction and Specification of Credal Sets
Credal sets can arise through various mechanisms:
- Interval probabilities: For each elementary outcome $\omega$, specify bounds $l(\omega) \le p(\omega) \le u(\omega)$, together with the normalization $\sum_{\omega} p(\omega) = 1$. The credal set is the intersection of these constraints with the simplex.
- Coherent lower previsions: Specify lower previsions for a set of gambles or events. The corresponding credal set is the set of distributions dominating all lower bounds.
- Elicitation from expert opinion: Aggregating judgments from multiple sources yields a credal set as the convex hull of the provided distributions (Morveli-Espinoza et al., 2020).
- Relative likelihood: Define the credal set as the set of models whose likelihood exceeds a given fraction $\alpha$ of the maximum likelihood, i.e., $\{\theta : L(\theta) \ge \alpha \max_{\theta'} L(\theta')\}$ (Löhr et al., 28 May 2025).
- Ensembling and data-driven methods: Deep ensembles, dropout networks, or specialized supervised learning techniques define credal sets as the convex hull of instance-wise ensemble outputs (Jürgens et al., 22 Feb 2025).
- Conformal prediction procedures: Credal sets are obtained by thresholding the divergence from a model's prediction so as to guarantee coverage or calibration properties (Javanmardi et al., 16 Feb 2024, Huang et al., 10 Jan 2025).
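As a concrete sketch of the relative-likelihood construction above (the data, cutoff $\alpha$, and grid resolution are illustrative choices, not from the cited work): for a Bernoulli model with $k$ successes in $n$ trials, the credal set is the interval of parameters whose likelihood is at least $\alpha$ times the maximum likelihood.

```python
import numpy as np

def bernoulli_relative_likelihood_set(k, n, alpha, grid=10001):
    """Approximate the credal set {theta : L(theta) >= alpha * max L}
    for the Bernoulli likelihood L(theta) = theta^k (1-theta)^(n-k),
    evaluated on a parameter grid; returns the interval endpoints."""
    thetas = np.linspace(1e-9, 1 - 1e-9, grid)
    loglik = k * np.log(thetas) + (n - k) * np.log(1 - thetas)
    keep = loglik >= loglik.max() + np.log(alpha)   # relative cutoff
    return float(thetas[keep].min()), float(thetas[keep].max())

# 7 successes in 10 trials: the interval contains the MLE 0.7
lo, hi = bernoulli_relative_likelihood_set(k=7, n=10, alpha=0.15)
```

Shrinking $\alpha$ toward 0 widens the credal set toward the vacuous model, while $\alpha = 1$ collapses it to the maximum-likelihood point.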
The specification may be separate (e.g., specifying conditional credal sets $K(X \mid Y = y)$ for each value $y$ of the conditioning variable) or extensive (e.g., specifying a set of full conditional probability functions), with nontrivial differences in induced joint distributions and potential for "unforeseen" vertices (Rocha et al., 2012).
The combination and composition of credal sets for multidimensional model construction follow the principles of valuation algebras and polyhedral geometry (Vejnarová et al., 2017, Ristic et al., 2022). For interval-based (box) credal sets, the intersection and Minkowski addition are efficient; more generally, quadratic or linear programming is required to handle joint constructions.
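For box (interval-based) credal sets, the efficient operations mentioned above reduce to componentwise interval arithmetic; a minimal sketch (the bounds are made-up examples, and the normalization constraint of the simplex is assumed to be enforced separately):

```python
def box_intersection(box_a, box_b):
    """Intersect two box credal sets given as lists of
    (lower, upper) probability bounds per outcome.
    Raises if some coordinate interval becomes empty."""
    out = []
    for (la, ua), (lb, ub) in zip(box_a, box_b):
        lo, hi = max(la, lb), min(ua, ub)
        if lo > hi:
            raise ValueError("empty intersection")
        out.append((lo, hi))
    return out

def box_minkowski_sum(box_a, box_b):
    """Minkowski addition of two boxes: add bounds componentwise."""
    return [(la + lb, ua + ub)
            for (la, ua), (lb, ub) in zip(box_a, box_b)]

A = [(0.2, 0.5), (0.5, 0.8)]
B = [(0.3, 0.6), (0.4, 0.7)]
C = box_intersection(A, B)   # componentwise tighter bounds
```

Both operations are linear in the number of outcomes, which is why box credal sets avoid the quadratic or linear programming needed for general joint constructions.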
3. Inference and Computational Complexity
Inference in credal models is generally more demanding than in single-distribution frameworks:
- Credal Bayesian networks and credal networks pose inference problems where marginal or conditional probabilities are sought over the entire strong extension of the network (the set of all joint distributions consistent with graphical and local credal set constraints). This corresponds to nonlinear optimization over exponentially many combinations of local extremal distributions (Rocha et al., 2012, Wijk et al., 2022).
- Complexity: Exact inference in credal networks—even for polytrees—is NP-hard (Rocha et al., 2012), due to the combinatorial explosion in the number of vertices. Even in tractable circuit-based models, the presence of credal parameters can markedly increase computational burden.
- Algorithmic reductions: Recent advances such as separable variable elimination and terminal evidence reduction for credal polytrees can drastically reduce dimensionality by leveraging separability—preserving tractability in special cases (notably binary variable frameworks) (Rocha et al., 2012).
In credal valuation networks, valuation algebra operations such as combination (aggregation of independent sources) and marginalization (projection to subspaces) are performed over credal sets, often via bilinear or quadratic programs (Ristic et al., 2022). Compositional operations (e.g., vacuous extension, projection) are handled via polyhedral geometry tools (Vejnarová et al., 2017).
For deep learning and neural predictors, credal set inference at run time typically involves forward passes through interval-valued architectures or ensemble predictions, with subsequent uncertainty quantification via entropy bounds or calibration tests (Wang et al., 10 Jan 2024, Jürgens et al., 22 Feb 2025).
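A minimal sketch of ensemble-based credal uncertainty quantification (the sampling approximation for the upper entropy is our illustrative choice, not the cited methods): the lower entropy over the convex hull of ensemble predictions is attained at an extreme point because Shannon entropy is concave, while the upper entropy lies in the interior and is approximated here by sampling random mixture weights.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log), clipped for numerical safety."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def credal_entropy_bounds(ensemble_probs, n_samples=2000, seed=0):
    """Lower/upper entropy over the convex hull of ensemble outputs.

    Entropy is concave, so its minimum over the hull sits at a
    vertex; its maximum is approximated by sampling Dirichlet
    mixture weights over the ensemble members."""
    P = np.asarray(ensemble_probs)            # shape (members, classes)
    lower = min(entropy(p) for p in P)
    rng = np.random.default_rng(seed)
    W = rng.dirichlet(np.ones(len(P)), size=n_samples)
    upper = max(entropy(w @ P) for w in W)
    return lower, max(upper, lower)

# Two sharply disagreeing members -> wide entropy gap (epistemic)
P = [[0.9, 0.1], [0.1, 0.9]]
lo_H, up_H = credal_entropy_bounds(P)
epistemic = up_H - lo_H
```

When the members agree, the hull collapses and the gap shrinks toward zero, matching the intuition that epistemic uncertainty is the "spread" of the credal set.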
4. Uncertainty Quantification and Calibration
Credal sets provide a principled framework for uncertainty quantification, distinguishing:
- Aleatoric uncertainty: Inherent randomness, captured by the minimal (lower) entropy or interval width within $\mathcal{K}$.
- Epistemic uncertainty: Model ignorance, captured by the "spread" of $\mathcal{K}$ (e.g., the difference between upper and lower expectations, or the entropy gap $\overline{H} - \underline{H}$) (Caprio et al., 2023, Wang et al., 10 Jan 2024).
Measures of credal set "size"—such as the volume of the polytope—are well-behaved for binary classification, where they correspond to the simple interval length, and enjoy desirable monotonicity, continuity, and invariance properties (Sale et al., 2023). However, in higher dimensions (multi-class problems), the volume can be highly sensitive to boundary perturbations and may poorly reflect true epistemic uncertainty, concentrating most of the measure near the boundary and being discontinuous as the effective affine dimension drops.
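In the binary case the well-behaved situation is easy to see concretely: the credal set over $\{0,1\}$ induced by a set of predictors is just an interval for the positive-class probability, and the volume-style measure is its length. A minimal illustration (the probabilities are made up):

```python
def binary_credal_interval(positive_probs):
    """Credal set over {0, 1} induced by a collection of predictors:
    its convex hull is the interval [min p, max p] for the positive
    class, and its 'volume' is simply the interval length."""
    lo, hi = min(positive_probs), max(positive_probs)
    return (lo, hi), hi - lo

interval, width = binary_credal_interval([0.35, 0.6, 0.5])
# interval spans (0.35, 0.6); width ~ 0.25 is the epistemic spread
```

It is exactly this simple interval-length picture that fails to generalize to multi-class simplices, where volume concentrates near the boundary and loses interpretability.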
Calibration of credal predictors is a central practical concern:
- Calibration tests: Nonparametric statistical tests can verify whether a convex combination of predictor outputs exists that is calibrated in distribution—i.e., whether some mixture $\sum_k \lambda_k \hat{p}_k$ of the ensemble predictions matches the true conditional distribution of outcomes (Jürgens et al., 22 Feb 2025).
- Conformal prediction approaches: Yield finite-sample, distribution-free guarantees that clarify the coverage and calibration properties of credal set predictors in both supervised and self-supervised contexts (Javanmardi et al., 16 Feb 2024, Lienen et al., 2022, Huang et al., 10 Jan 2025).
Instance-dependent convex combinations in ensembles improve both predictive performance and the power of calibration tests, recognizing that the optimal epistemic blending of models can change over the input space (Jürgens et al., 22 Feb 2025).
5. Applications and Algorithmic Frameworks
Credal sets, due to their versatility in modeling epistemic uncertainty, underpin a range of statistical and machine learning applications:
- Credal Bayesian Networks (CrBNs) and credal networks: Used in expert systems, reliability analysis, and robust decision support; enable robust inference under model parameter ambiguity (Rocha et al., 2012, Wijk et al., 2022).
- Credal Model Averaging (CMA): Guards against prior sensitivity in Bayesian model averaging, particularly important in small sample regimes or in the presence of expert-discordant priors (Corani et al., 2014).
- Credal Deep Learning: Finitely generated credal sets on neural network priors and likelihoods enable the construction of "infinite" BNN ensembles, yielding robust quantification and disentanglement of aleatoric and epistemic uncertainty, with strong empirical advantages in distribution shift and safety-critical domains (Caprio et al., 2023).
- Self-supervised and semi-supervised learning: Credal sets provide uncertainty-aware pseudo-labeling, leading to improved calibration, especially in scarce label regimes (Lienen et al., 2021, Lienen et al., 2022).
- Valuation and reasoning networks: Credal sets serve as valuations in graphical systems (valuation networks, sentential decision diagrams), supporting inference under uncertainty and logical constraints (Ristic et al., 2022, Mattei et al., 2020).
- Hypothesis testing and statistical inference: Credal two-sample tests generalize classical tests to compare families of distributions with rigorous nonparametric permutation-based correction for epistemic uncertainty (Chau et al., 16 Oct 2024, Hibshman et al., 2021).
- Domain adaptation and learning theory: Credal learning theory frames generalization guarantees in terms of plausible distribution sets, naturally yielding robustness to domain shift and variability (Caprio et al., 1 Feb 2024).
In practice, credal set methods are often coupled with convex optimization, linear or quadratic programming, and polyhedral computations to perform inference and uncertainty quantification.
6. Limitations, Challenges, and Future Directions
Credal set methodologies introduce significant mathematical and computational complexity. Notable limitations and ongoing challenges include:
- Computational intractability: The explosion in the number of extreme points with model size imposes severe computational restrictions, especially for high-dimensional or richly structured credal sets (Rocha et al., 2012, Vejnarová et al., 2017).
- Specification burden: Eliciting or inferring credal sets—whether through lower previsions, interval probabilities, or convex hulls of candidate models—may require substantial domain knowledge or advanced aggregation methods (Morveli-Espinoza et al., 2020, Corani et al., 2014).
- Analysis of high-dimensional geometry: As dimensionality increases, intuition for volume or set-based uncertainty deteriorates (Sale et al., 2023).
- Selection of combination and updating rules: Different credal composition or inference rules, such as those based on marginalization or vacuous extension, can yield markedly different downstream uncertainty quantification (Vejnarová et al., 2017).
Research continues on tractable algorithms for inference (e.g., via circuit compilation (Wijk et al., 2022)), calibration (nonparametric tests (Jürgens et al., 22 Feb 2025)), and the optimal design of credal sets for deep learning settings (Caprio et al., 2023), as well as the development of learning-theoretic frameworks that derive practical guarantees across all plausible distributions (Caprio et al., 1 Feb 2024). Addressing these open questions remains key for deploying credal set methods in safety-critical, data-scarce, or distribution-shifted settings.
7. Comparative Context and Theoretical Significance
Credal sets generalize single-probability models—Bayesian, frequentist, or otherwise—and other uncertainty frameworks such as Dempster–Shafer evidence theory (allowing for more flexibility and explicit representation of ignorance), possibility theory (with a less "coarse" encoding of uncertainty), and fuzzy set theory.
They also connect to robust optimization, imprecise Markov decision processes, and game-theoretic probability. In hypothesis testing, higher-order credal sets (modeling "uncertainty about uncertainty") can resolve classical paradoxes such as dilation and belief inertia, sometimes leading to the emergence of a unique, non-informative, total-variation-uniform prior (Hibshman et al., 2021).
The theory of credal sets, as both an epistemic and algorithmic tool, continues to expand its impact in uncertainty quantification, robust AI, and real-world decision systems, motivating research into tighter integration with learning theory and scalable computation.