Conditional Belief Function

Updated 11 April 2026

Conditional belief functions are generalizations of conditional probability in the Dempster–Shafer framework that represent epistemic uncertainty using multiple valid constructions.
They incorporate diverse conditioning methods—geometric, envelope-based, and Jeffrey-like—that yield distinct computational and interpretational benefits.
Their use in evidential networks facilitates efficient inference and message passing while addressing challenges like non-proper mass assignments.

A conditional belief function generalizes the concept of conditional probability to belief function theory, enabling the representation and propagation of epistemic uncertainty in complex domains. Unlike classical probability, where conditioning is uniquely defined by Bayes’ rule, conditional belief in the Dempster–Shafer framework admits several mathematically and conceptually distinct constructions, each with its own implications for graphical modeling, inference algorithms, and the interpretation of uncertainty.

1. Formal Definition of Conditional Belief Functions

Let $U$ be a finite set of variables, and let $A$ and $B$ be disjoint subsets of $U$ with Cartesian-product frames $\Omega_A$ and $\Omega_B$. A (possibly unnormalized) basic belief assignment (bba) is a function $m_B: 2^{\Omega_B} \to [0,1]$ with $\sum_{Y \subseteq \Omega_B} m_B(Y) = 1$ and $m_B(\varnothing) \ge 0$. The induced belief and plausibility functions are: $\mathrm{Bel}_B(X) = \sum_{Y \subseteq X} m_B(Y)\,, \qquad \mathrm{Pl}_B(X) = \sum_{Y \cap X \neq \varnothing} m_B(Y)\,.$
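The two sums above translate directly into code. The following is a minimal sketch, assuming a bba is stored as a dictionary from focal sets (Python `frozenset`s) to masses; the frame `{"r", "g", "b"}` and all numbers are hypothetical and chosen only for illustration.

```python
def bel(m, x):
    """Bel(X): total mass of focal sets Y that are subsets of X."""
    x = frozenset(x)
    return sum(w for y, w in m.items() if y <= x)


def pl(m, x):
    """Pl(X): total mass of focal sets Y that intersect X."""
    x = frozenset(x)
    return sum(w for y, w in m.items() if y & x)


# Hypothetical bba on the frame {"r", "g", "b"} (illustrative numbers only).
m_B = {
    frozenset({"r"}): 0.5,
    frozenset({"r", "g"}): 0.3,
    frozenset({"r", "g", "b"}): 0.2,
}

print(bel(m_B, {"r", "g"}))  # mass committed to subsets of {r, g}
print(pl(m_B, {"g"}))        # mass not contradicting {g}
```

In this representation the belief of a set never exceeds its plausibility, and the gap between the two reflects the mass assigned to non-singleton focal sets.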

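A conditional belief function on $B$ given $A$ is a family $\{m_B(\cdot \mid a)\}_{a \in \Omega_A}$ of bbas $m_B(\cdot \mid a): 2^{\Omega_B} \to [0,1]$ with $\sum_{Y \subseteq \Omega_B} m_B(Y \mid a) = 1$ for every $a \in \Omega_A$. The conditional belief and plausibility are: $\mathrm{Bel}_B(X \mid a) = \sum_{Y \subseteq X} m_B(Y \mid a)\,, \qquad \mathrm{Pl}_B(X \mid a) = \sum_{Y \cap X \neq \varnothing} m_B(Y \mid a)\,.$ This notion admits local (per-$a$) normalization without any global coupling across $a$ (Xu et al., 2013).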
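A family of this kind can be held as one bba per conditioning value. The sketch below assumes a dictionary keyed by the values of the conditioning variable; the names `family`, `cond_bel`, `cond_pl`, and `is_locally_normalized`, as well as the numbers, are hypothetical.

```python
OMEGA_B = frozenset({"b0", "b1"})

# Hypothetical conditional family: one bba on OMEGA_B per value a of A.
family = {
    "a0": {frozenset({"b0"}): 0.8, OMEGA_B: 0.2},
    "a1": {frozenset({"b1"}): 0.6, OMEGA_B: 0.4},
}


def is_locally_normalized(family, tol=1e-9):
    """Check that each conditional bba sums to 1 on its own (per-a check)."""
    return all(abs(sum(m.values()) - 1.0) <= tol for m in family.values())


def cond_bel(family, a, x):
    """Conditional belief of X given the value a."""
    x = frozenset(x)
    return sum(w for y, w in family[a].items() if y <= x)


def cond_pl(family, a, x):
    """Conditional plausibility of X given the value a."""
    x = frozenset(x)
    return sum(w for y, w in family[a].items() if y & x)


print(is_locally_normalized(family))
print(cond_bel(family, "a0", {"b0"}), cond_pl(family, "a0", {"b1"}))
```

The normalization check touches each conditioning value separately, which is the sense in which no coupling across values is needed.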

2. Joint and Conditional Representations: Conversion and Consistency

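Given a joint belief function with bba $m_{AB}$ on $\Omega_A \times \Omega_B$, a conditional family $\{m_B(\cdot \mid a)\}_{a \in \Omega_A}$ can be obtained by conditioning on each $a$ and marginalizing onto $B$:

$m_B(Y \mid a) = \sum_{Z \subseteq \Omega_A \times \Omega_B \,:\, \pi_B(Z \cap \pi_A^{-1}(a)) = Y} m_{AB}(Z)\,,$

where $\pi_A$, $\pi_B$ denote projections onto $\Omega_A$ and $\Omega_B$.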

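Conversely, the ballooning extension constructs a unique joint bba $m_{AB}$ given a family $\{m_B(\cdot \mid a)\}_{a \in \Omega_A}$ by: $m_{AB}\bigl(\bigcup_{a \in \Omega_A} \{a\} \times Y_a\bigr) = \prod_{a \in \Omega_A} m_B(Y_a \mid a)$ for $Y_a \subseteq \Omega_B$. Not all conditional families admit a globally consistent joint; a necessary consistency requirement is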

$B$ 8

This conversion is fundamental to evidential network construction and probabilistic graphical model generalizations (Xu et al., 2013).
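The conversion in both directions can be sketched in a few lines. This is an illustrative sketch, not the cited authors' implementation: it assumes the product construction commonly associated with the ballooning extension, in which the focal set formed by the union of $\{a\} \times Y_a$ over all $a$ receives the product of the masses $m_B(Y_a \mid a)$, and it uses unnormalized conditioning on a value followed by projection onto $B$ for the reverse step. The function names and numbers are hypothetical.

```python
from itertools import product

OMEGA_B = frozenset({"b0", "b1"})

# Hypothetical conditional family (illustrative numbers only).
family = {
    "a0": {frozenset({"b0"}): 0.8, OMEGA_B: 0.2},
    "a1": {frozenset({"b1"}): 0.6, OMEGA_B: 0.4},
}


def joint_from_family(family):
    """Joint bba on pairs (a, b): one focal set per choice of Y_a for every a."""
    values = list(family)
    joint = {}
    for choice in product(*(family[a].items() for a in values)):
        # choice[i] is a (focal set Y_a, mass) pair for values[i].
        focal = frozenset(
            (a, b) for a, (y_a, _) in zip(values, choice) for b in y_a
        )
        mass = 1.0
        for _, w in choice:
            mass *= w
        joint[focal] = joint.get(focal, 0.0) + mass
    return joint


def condition_and_project(joint, a):
    """Keep the pairs whose first component is a, then project onto B."""
    out = {}
    for focal, w in joint.items():
        y = frozenset(b for (a2, b) in focal if a2 == a)
        out[y] = out.get(y, 0.0) + w
    return out


joint = joint_from_family(family)
recovered = condition_and_project(joint, "a0")
print(len(joint), recovered)
```

For this family, conditioning the constructed joint on each value returns the conditional bba that was supplied for that value, which is the round-trip property the conversion relies on.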

3. Conditioning in Belief Functions: Approaches and Properties

Dempster’s Rule and Generalizations

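The standard Dempster–Shafer conditioning formula, for evidence $E$ with $\mathrm{Pl}(E) > 0$ and complement $\bar{E}$, is

$\mathrm{Bel}(X \mid E) = \frac{\mathrm{Bel}(X \cup \bar{E}) - \mathrm{Bel}(\bar{E})}{1 - \mathrm{Bel}(\bar{E})}$

and

$\mathrm{Pl}(X \mid E) = \frac{\mathrm{Pl}(X \cap E)}{\mathrm{Pl}(E)}\,.$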

This operation is a Bayes-style reallocation of mass and does not require Dempster’s rule of combination for independent evidence, a distinction emphasized by Kerkvliet et al. (2015).
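At the level of masses, this conditioning amounts to intersecting every focal set with the evidence, discarding empty intersections, and renormalizing by the plausibility of the evidence. The sketch below assumes a normalized bba stored as a dictionary from `frozenset`s to masses; the function name and the numbers are hypothetical.

```python
def dempster_condition(m, evidence):
    """Condition a normalized bba on `evidence` by intersecting focal sets."""
    evidence = frozenset(evidence)
    moved = {}
    for y, w in m.items():
        z = y & evidence
        if z:  # mass of focal sets disjoint from the evidence is dropped
            moved[z] = moved.get(z, 0.0) + w
    pl_e = sum(moved.values())  # equals Pl(evidence)
    if pl_e == 0:
        raise ValueError("conditioning event has zero plausibility")
    return {z: w / pl_e for z, w in moved.items()}


# Hypothetical bba on the frame {"r", "g", "b"} (illustrative numbers only).
m = {
    frozenset({"r"}): 0.5,
    frozenset({"r", "g"}): 0.3,
    frozenset({"r", "g", "b"}): 0.2,
}

conditioned = dempster_condition(m, {"g", "b"})
print(conditioned)
```

Here the mass on `{"r"}` is incompatible with the evidence and is removed, and the remaining mass is rescaled so that the conditioned bba sums to one.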

Recent developments have advanced alternative conditionings:

Geometric Conditioning: Interprets conditioning as projecting the mass vector onto the face of the simplex corresponding to the conditioning event, minimizing a suitable $L_p$ norm (e.g., $L_2$ for equal redistribution, $L_1$ for sparsest shift) (Cuzzolin, 2021).
Envelope-based Conditioning: Defines conditional belief as the lower envelope (infimum) of conditional probabilities compatible with the initial belief function, yielding closed-form expressions such as $\mathrm{Bel}(X \mid E) = \frac{\mathrm{Bel}(X \cap E)}{\mathrm{Bel}(X \cap E) + \mathrm{Pl}(E \setminus X)}$ (Fagin et al., 2013).
Jeffrey-like Conditioning: Jeffrey's rule generalizes to belief functions, yielding "geometric" and "Dempster" Jeffrey rules. The geometric rule preserves proportional allocations within coarse atoms, whereas the Dempster rule redistributes mass after intersection with the event (Smets, 2013).

Each construction has distinct epistemological and computational implications.
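To make the difference between constructions concrete, the sketch below compares Dempster conditioning with the envelope-based construction on one small example. It assumes a normalized bba and the well-known Fagin–Halpern closed form, in which the conditional belief of $X$ given $E$ is $\mathrm{Bel}(X \cap E)$ divided by $\mathrm{Bel}(X \cap E) + \mathrm{Pl}(E \setminus X)$; the function names and numbers are hypothetical.

```python
def bel(m, x):
    """Total mass of focal sets contained in x."""
    x = frozenset(x)
    return sum(w for y, w in m.items() if y <= x)


def pl(m, x):
    """Total mass of focal sets intersecting x."""
    x = frozenset(x)
    return sum(w for y, w in m.items() if y & x)


def dempster_bel(m, x, e):
    """Bel(X | E) under Dempster conditioning, via 1 - Pl(E minus X) / Pl(E)."""
    x, e = frozenset(x), frozenset(e)
    pl_e = pl(m, e)
    if pl_e == 0:
        raise ValueError("conditioning event has zero plausibility")
    return 1.0 - pl(m, e - x) / pl_e


def envelope_bel(m, x, e):
    """Bel(X | E) as the lower envelope of compatible conditional probabilities."""
    x, e = frozenset(x), frozenset(e)
    num = bel(m, x & e)
    den = num + pl(m, e - x)
    if den == 0:
        raise ValueError("undefined: the conditioning event needs positive belief")
    return num / den


# Hypothetical bba on the frame {"x", "y", "z"} (illustrative numbers only).
m = {
    frozenset({"x"}): 0.4,
    frozenset({"y"}): 0.1,
    frozenset({"x", "z"}): 0.3,
    frozenset({"x", "y", "z"}): 0.2,
}

print(dempster_bel(m, {"x"}, {"x", "y"}))
print(envelope_bel(m, {"x"}, {"x", "y"}))
```

On this example the envelope-based value is smaller than the Dempster value, illustrating that the two constructions can assign different conditional beliefs to the same event.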

4. Conditional Belief Functions in Evidential Networks

Evidential networks with conditional belief functions (ENCs) are directed acyclic graphs $G = (V, \mathcal{E})$ where each edge $(A, B) \in \mathcal{E}$ is annotated with a conditional belief function $m_B(\cdot \mid a)$ for all $a \in \Omega_A$. The network encodes

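$m_V = \bigoplus_{(A,B)} m_{AB}\,,$

where the combination runs over all edges $(A, B)$ of the graph and $m_{AB}$ is the joint bba induced by the conditional family on that edge (vacuously extended to the full frame), with no explicit joint stored (Xu et al., 2013).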

Message Passing: For marginal $\Omega_A$ 2 of node $\Omega_A$ 3 in a singly connected network: $\Omega_A$ 4 where $\Omega_A$ 5 is Dempster’s conjunctive combination and $\Omega_A$ 6 is computed via back-projection through the relevant conditional: $\Omega_A$ 7 (Xu et al., 2013). Complexity in a tree is $\Omega_A$ 8. For networks with loops, node merging (ballooning) is used to restore polytree structure at exponential cost in merged frame size.
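The aggregation step at a node can be illustrated with Dempster's rule of combination applied to the incoming messages. The sketch below covers only that step and assumes the messages have already been computed and are expressed on the node's own frame; the function names, the frame `{"t", "f"}`, and the numbers are hypothetical.

```python
from functools import reduce


def dempster_combine(m1, m2):
    """Normalized conjunctive combination of two bbas on the same frame."""
    raw = {}
    for y1, w1 in m1.items():
        for y2, w2 in m2.items():
            z = y1 & y2
            raw[z] = raw.get(z, 0.0) + w1 * w2
    conflict = raw.pop(frozenset(), 0.0)  # mass landing on the empty set
    if 1.0 - conflict <= 1e-12:
        raise ValueError("total conflict: combination undefined")
    return {z: w / (1.0 - conflict) for z, w in raw.items()}


def aggregate(messages):
    """Combine all incoming messages at a node, left to right."""
    return reduce(dempster_combine, messages)


FRAME = frozenset({"t", "f"})

# Two hypothetical incoming messages (illustrative numbers only).
msg_1 = {frozenset({"t"}): 0.6, FRAME: 0.4}
msg_2 = {frozenset({"f"}): 0.5, FRAME: 0.5}

marginal = aggregate([msg_1, msg_2])
print(marginal)
```

Each pairwise combination ranges over pairs of focal sets, so its cost grows with the number of focal sets in the two messages, which in the worst case is exponential in the size of the node's frame.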

5. Conditioning and Inference: Practical and Theoretical Issues

Loss of Properness and Negative Mass

Standard conditioning operations in DST can yield non-proper belief functions: removal (division) operations may produce negative mass assignments, making semantic interpretation and statistical testing problematic.

Example: For joint belief functions over triple variables $\Omega_A$ 9 with positive mass, there may exist no non-negative mass functions $\Omega_B$ 0 such that $\Omega_B$ 1, with negative masses arising in attempted “conditionally independent” factorizations (Matuszewski et al., 2018).

Workarounds: Frequency-based Factorization

The $\Omega_B$ 2-measure provides a workaround: $\Omega_B$ 3 This function remains non-negative and supports log-linear and $\Omega_B$ 4 statistical tests, enabling empirical detection of (conditional) independence from data despite the non-properness obstacle in standard conditionals (Matuszewski et al., 2018).

6. Special Constructions and Applications

Simplification and Pruning in Evidential Networks

Certain conditional belief function structures allow substantial simplifications:

Irrelevant Value Pruning: If $\Omega_B$ 5 is irrelevant for $\Omega_B$ 6 (i.e., $\Omega_B$ 7), then evidence on $\Omega_B$ 8 carries no information for $\Omega_B$ 9 and edges can be pruned from computation (Xu et al., 2013).
Unrelated via Intermediary: If the sets of values of $B$ relevant to $A$ and to $C$ are disjoint, observation of $A$ has no impact on $C$ given $B$ (Xu et al., 2013).

Canonical Forms: Belief Noisy-OR

Belief Noisy-OR (BNOR) provides a parameter-reduced way to define conditional belief functions, analogous to the Noisy-OR gate for Bayesian networks. It encodes both aleatory and epistemic uncertainty and reduces the parameterization from exponential to linear in the number of parents, yielding conditional bbas with focal elements reflecting lower, upper, and vacuous knowledge (Zhou et al., 2016).

Non-Destructive Sampling

Efficient sample generation from conditional belief networks is feasible for a restricted class of conditional beliefs with non-negative $m_B: 2^{\Omega_B} \to [0,1]$ 6-representation, compatible with Bayesian network-like topologies where no two parents of a node are connected, using value splits and surrogate probability tables (Kłopotek, 2020).

7. Worked Example: Conditional Belief Update

Consider variables $m_B: 2^{\Omega_B} \to [0,1]$ 7 with binary frames and conditionals:

$m_B: 2^{\Omega_B} \to [0,1]$ 8
$m_B: 2^{\Omega_B} \to [0,1]$ 9 and similar for $\sum_{Y \subseteq \Omega_B} m_B(Y) = 1$ 0 (Xu et al., 2013). Given observations $\sum_{Y \subseteq \Omega_B} m_B(Y) = 1$ 1 and a vacuous prior on $\sum_{Y \subseteq \Omega_B} m_B(Y) = 1$ 2:
Compute local messages, e.g., $\sum_{Y \subseteq \Omega_B} m_B(Y) = 1$ 3
Aggregate at $\sum_{Y \subseteq \Omega_B} m_B(Y) = 1$ 4 to yield posterior $\sum_{Y \subseteq \Omega_B} m_B(Y) = 1$ 5

All inference is performed via localized computations, without need for the global joint, highlighting the efficiency and modularity of conditional belief representations.
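As an illustration of such a local computation, the sketch below propagates a bba on a parent $A$ through a conditional family to its child $B$ without ever building the joint. It assumes the disjunctive rule from the belief function literature: for each focal set of the parent, one focal set is chosen from the conditional of each of its values, their union becomes a focal set on $B$, and the masses are multiplied. The numbers are hypothetical and are not those of the example above.

```python
from itertools import product

OMEGA_A = frozenset({"a0", "a1"})
OMEGA_B = frozenset({"b0", "b1"})

# Hypothetical conditional family and parent bba (illustrative numbers only).
family = {
    "a0": {frozenset({"b0"}): 0.8, OMEGA_B: 0.2},
    "a1": {frozenset({"b1"}): 0.6, OMEGA_B: 0.4},
}
m_A = {frozenset({"a0"}): 0.7, OMEGA_A: 0.3}


def forward_message(m_a, family):
    """Propagate a bba on A to a bba on B using only the local conditionals."""
    out = {}
    for xs, w_x in m_a.items():
        members = sorted(xs)
        # Choose one focal set from the conditional of every value in xs.
        for choice in product(*(family[a].items() for a in members)):
            focal = frozenset().union(*(y for y, _ in choice))
            mass = w_x
            for _, w in choice:
                mass *= w
            out[focal] = out.get(focal, 0.0) + mass
    return out


m_B = forward_message(m_A, family)
vacuous_out = forward_message({OMEGA_A: 1.0}, family)
print(m_B)
print(vacuous_out)
```

With these numbers, a vacuous bba on the parent yields a vacuous bba on the child, while partial commitment to `a0` transfers part of the mass to `{"b0"}`.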

Conditional belief functions unify the expressive power of belief function theory with the compositional machinery of graphical models, supporting efficient inference, modular knowledge elicitation, and principled handling of epistemic uncertainty. Their theoretical development, practical propagation algorithms, and emerging workarounds for non-properness and factorization are foundational for applications in expert systems, network reliability, and multi-agent reasoning with partial information (Xu et al., 2013, Kerkvliet et al., 2015, Matuszewski et al., 2018, Zhou et al., 2016, Kłopotek, 2020, Smets, 2013, Cuzzolin, 2021, Arieli et al., 2023, Laskey, 2013, Black et al., 2013, Matuszewski et al., 2017, Fagin et al., 2013).