Bisimulation and Causal States
- Bisimulation and causal states are foundational concepts that formalize behavioral equivalence and minimal information sufficiency for accurate prediction and control.
- They underpin methodologies in reinforcement learning, process calculi, and Petri nets, facilitating model abstraction and system verification.
- Recent advances employ quantitative bisimulation metrics to enhance sample efficiency in RL and improve robustness in partially observable settings.
Bisimulation and causal states are central concepts in the formal theory of dynamical systems, concurrency, automata, and reinforcement learning. Bisimulation provides a mathematical framework for behavioral equivalence in both discrete and continuous settings, while causal states formalize what information is necessary and sufficient for prediction and control. The two notions are deeply connected: causal states are typically the classes of the coarsest (minimal) bisimulation partition of histories, and bisimulation metrics quantify proximity between representations that differ in causally irrelevant details. These concepts underpin algorithms for abstraction, verification, and learning across settings including partially observable Markov decision processes (POMDPs), process calculi, Petri nets, and effectful automata.
1. Formal Definitions: Causal States and Bisimulation
In partially observable environments, such as POMDPs or perturbed POMDPs (P²OMDPs), the minimal sufficient statistic for future prediction is given by the causal state. Formally, let $h_t$ denote the observation–action history up to time $t$. Causal equivalence identifies two histories, $h \sim h'$, iff their conditional distributions over all future observations coincide for every sequence of future actions:

$$P(o_{t+1:\infty} \mid h, a_{t:\infty}) = P(o_{t+1:\infty} \mid h', a_{t:\infty}).$$

The corresponding causal state is the equivalence class $[h]_\sim$ (Zhang et al., 2019).
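The grouping of histories into causal states can be illustrated on a toy process. The sketch below is only an approximation under stated assumptions: the two-state generator `HMM` (the "even process"), its start state, the absence of actions, and the finite look-ahead `horizon` are all hypothetical choices made for illustration, and a finite horizon only approximates equality over all futures.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical generator: state -> {symbol: (probability, next_state)}.
HMM = {
    "A": {"0": (Fraction(1, 2), "A"), "1": (Fraction(1, 2), "B")},
    "B": {"1": (Fraction(1), "A")},
}
START = "A"  # assumed known initial state


def word_prob(word):
    """Probability that the process emits `word` from the start state."""
    state, prob = START, Fraction(1)
    for sym in word:
        if sym not in HMM[state]:
            return Fraction(0)
        step_prob, state = HMM[state][sym]
        prob *= step_prob
    return prob


def future_distribution(history, horizon):
    """P(next `horizon` symbols | history), or None for impossible histories."""
    p_hist = word_prob(history)
    if p_hist == 0:
        return None
    return tuple(
        word_prob(history + "".join(fut)) / p_hist
        for fut in product("01", repeat=horizon)
    )


def causal_states(max_len, horizon):
    """Group all possible histories up to `max_len` by predictive distribution."""
    classes = defaultdict(list)
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            history = "".join(symbols)
            dist = future_distribution(history, horizon)
            if dist is not None:
                classes[dist].append(history)
    return list(classes.values())


classes = causal_states(max_len=3, horizon=2)
print(len(classes))  # → 2
```

Here the eleven possible histories of length at most three collapse into two classes, one per hidden state of the toy generator, even though the grouping only ever looks at observable word probabilities.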
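Bisimulation, in the reinforcement learning tradition, is an equivalence relation $\sim$ on states such that $s \sim s'$ implies:
- $R(s, a) = R(s', a)$ for all actions $a$;
- $P(C \mid s, a) = P(C \mid s', a)$ for all actions $a$ and for every equivalence class $C$ of $\sim$, where $P(C \mid s, a) = \sum_{u \in C} P(u \mid s, a)$.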
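For a finite MDP, the two conditions above can be checked by partition refinement: start from the partition induced by rewards and split blocks until every state in a block sends the same probability mass to every block. The following is a minimal sketch, assuming a tabular MDP given as dictionaries; the function name `coarsest_bisimulation` and the four-state example are hypothetical.

```python
from fractions import Fraction


def coarsest_bisimulation(states, actions, reward, trans):
    """Return the coarsest bisimulation partition of a finite MDP.

    reward[(s, a)] is a number; trans[(s, a)] maps next states to
    probabilities (Fractions, so that block masses compare exactly).
    """
    def group_by(signature):
        groups = {}
        for s in states:
            groups.setdefault(signature(s), []).append(s)
        return list(groups.values())

    # Start from the partition induced by immediate rewards alone.
    partition = group_by(lambda s: tuple(reward[(s, a)] for a in actions))
    while True:
        block = {s: i for i, members in enumerate(partition) for s in members}

        def signature(s):
            sig = [block[s]]  # keeping the current block: blocks are only split
            for a in actions:
                mass = [Fraction(0)] * len(partition)
                for nxt, p in trans[(s, a)].items():
                    mass[block[nxt]] += p
                sig.append(tuple(mass))
            return tuple(sig)

        refined = group_by(signature)
        if len(refined) == len(partition):  # no block was split: fixed point
            return refined
        partition = refined


half = Fraction(1, 2)
states = ["s0", "s1", "s2", "s3"]
actions = ["a"]
reward = {("s0", "a"): 0, ("s1", "a"): 0, ("s2", "a"): 0, ("s3", "a"): 1}
trans = {
    ("s0", "a"): {"s2": half, "s3": half},
    ("s1", "a"): {"s2": half, "s3": half},
    ("s2", "a"): {"s2": Fraction(1)},
    ("s3", "a"): {"s3": Fraction(1)},
}
print(coarsest_bisimulation(states, actions, reward, trans))
# → [['s0', 's1'], ['s2'], ['s3']]
```

States `s0` and `s1` differ only in name, so they end up in one block; `s2` shares their reward but is split off because it sends no mass to the rewarding state.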
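For continuous state spaces, the bisimulation metric $d$ (the least fixed point of the Ferns–Panangaden–Precup operator) satisfies, for a weight $c \in [0, 1)$:

$$d(s, s') = \max_{a} \Big[ (1 - c)\, |R(s, a) - R(s', a)| + c\, W_1\big(P(\cdot \mid s, a), P(\cdot \mid s', a); d\big) \Big],$$

where $W_1(\cdot, \cdot\,; d)$ is the Wasserstein distance induced by $d$ (Zhang et al., 2019, Li et al., 29 Nov 2025).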
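The least fixed point can be approximated by iterating the operator from the zero pseudometric. The sketch below makes two simplifying assumptions that are not part of the text: transitions are deterministic (so the Wasserstein term reduces to the distance between the two successor states), and the reward and transition terms are weighted by `1 - c` and `c`. The four-state example is hypothetical.

```python
def bisimulation_metric(states, actions, reward, step, c=0.9,
                        tol=1e-10, max_iter=10_000):
    """Fixed-point iteration for a bisimulation metric on a deterministic MDP.

    reward[(s, a)] is a number and step[(s, a)] is the unique successor state.
    For Dirac successor distributions, W_1 is just d(step(s, a), step(t, a)).
    """
    d = {(s, t): 0.0 for s in states for t in states}  # start from zero
    for _ in range(max_iter):
        new = {
            (s, t): max(
                (1 - c) * abs(reward[(s, a)] - reward[(t, a)])
                + c * d[(step[(s, a)], step[(t, a)])]
                for a in actions
            )
            for s in states
            for t in states
        }
        if max(abs(new[k] - d[k]) for k in d) < tol:
            return new
        d = new
    return d


states = ["u", "v", "w", "goal"]
actions = ["go"]
reward = {("u", "go"): 1.0, ("v", "go"): 1.0,
          ("w", "go"): 0.0, ("goal", "go"): 0.0}
step = {("u", "go"): "goal", ("v", "go"): "goal",
        ("w", "go"): "w", ("goal", "go"): "goal"}

d = bisimulation_metric(states, actions, reward, step, c=0.9)
print(d[("u", "v")], round(d[("u", "w")], 6))  # → 0.0 0.1
```

States `u` and `v` are bisimilar and sit at distance zero, while `u` and `w` differ only in their immediate reward and end up at distance `1 - c`. For stochastic transitions the inner term would need an optimal-transport solver, which this sketch deliberately leaves out.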
In event-structure and concurrency theory, bisimulation is defined coinductively over labeled transition systems or operational models (Petri nets, π-calculus, etc.), with explicit preservation of the causal order of events and concurrency structure (Gorrieri, 2022, Aubert et al., 2022).
2. Theoretical Connection: Causal States as Bisimulation Classes
In both RL and concurrency, the set of causal states coincides with the coarsest bisimulation partition sufficient for future prediction, that is, the sufficient partition with the fewest classes. In history-based MDPs, the causal-state partition is itself a bisimulation (Zhang et al., 2019). In the theory of Petri nets, causal bisimulation over "causal case graphs" yields equivalence classes that encode the entire causal history needed for further behavior. These classes are the so-called "causal states" (Bruni et al., 2015). In process calculi, causal-state representations are obtained as configurations (downward-closed, conflict-free sets) of events in an event structure, with bisimulation preserving the causal partial order of events (Aubert et al., 2022).
In “True Concurrency Can Be Easy,” the equivalence class of a marking under step-net bisimulation is a causal state: all markings with the same unordered, partial-order history of events up to concurrency and conflict are identified. This extends to structure-preserving and causal-net bisimilarity (Gorrieri, 2022).
3. Quantitative Bisimulation Metrics and Value Approximation
Several works extend the qualitative notion of bisimulation to a quantitative metric, particularly in RL and learning theory, to reason about continuous representations and approximation. The bisimulation metric $d$ is the canonical behavioral pseudometric, with continuity properties:

$$|V^*(s) - V^*(s')| \le \frac{1}{1 - c}\, d(s, s'),$$

where $V^*$ is the optimal value function and $c \in [0, 1)$ is the weight the metric places on its transition term, taken to be at least the discount factor (Zhang et al., 2019). In recent advances, such as CaDiff (Li et al., 29 Nov 2025), a new bisimulation distance is introduced, built on a Wasserstein metric between observable and denoised (causal) states. This framework provides explicit error bounds on value-function approximation, decomposing contributions from clustering, reward and transition modeling, and bisimulation metric learning (Li et al., 29 Nov 2025).
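The value-function continuity property can be derived by a short induction. The sketch below assumes the metric has the weighted form $d(s,s') = \max_a [(1-c)\,|R(s,a) - R(s',a)| + c\,W_1(P(\cdot \mid s,a), P(\cdot \mid s',a); d)]$ with discount factor $\gamma \le c$; these weights are an assumption of the sketch, and $V_n$, $d_n$ denote the $n$-th iterates of value iteration and of the metric operator, both started from zero.

```latex
% Assumptions (illustrative): weights (1-c) and c, discount \gamma \le c,
% V_0 \equiv 0, d_0 \equiv 0.
% Induction hypothesis: (1-c) V_n is 1-Lipschitz with respect to d_n.
\begin{align*}
(1-c)\,\bigl|V_{n+1}(s) - V_{n+1}(s')\bigr|
 &\le \max_a \Bigl[ (1-c)\,|R(s,a) - R(s',a)|
      + \gamma\,\Bigl|\textstyle\sum_{u} \bigl(P(u \mid s,a) - P(u \mid s',a)\bigr)\,(1-c)\,V_n(u)\Bigr| \Bigr] \\
 &\le \max_a \Bigl[ (1-c)\,|R(s,a) - R(s',a)|
      + c\,W_1\bigl(P(\cdot \mid s,a),\, P(\cdot \mid s',a);\, d_n\bigr) \Bigr] \\
 &= d_{n+1}(s,s').
\end{align*}
% Step 1: |max_a f(a) - max_a g(a)| \le max_a |f(a) - g(a)|.
% Step 2: Kantorovich--Rubinstein duality applied to the 1-Lipschitz
%         function (1-c) V_n, together with \gamma \le c.
% Letting n \to \infty gives |V^*(s) - V^*(s')| \le d(s,s') / (1-c).
```

The bound is tight in simple cases: two states whose only difference is a unit reward gap followed by identical futures have $d = 1 - c$ and a value gap of exactly one.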
4. Methodologies: Coalgebraic, Categorical, and Diffusion-Based Approaches
Coalgebraic approaches (as in Bruni–Montanari–Sammartino (Bruni et al., 2015)) recast causal bisimulation in the context of presheaf categories over posets, making causal dependencies explicit and enabling abstract minimization using symmetries. In category theory, effectful Mealy machines are captured by a state, initial state, and transition morphism, and bisimulation is characterized both syntactically (via uniform feedback extension) and coalgebraically (via spans of homomorphisms) (Bonchi et al., 2024).
In contemporary learning paradigms, CaDiff introduces asynchronous diffusion models (ADM) to denoise perturbed observation sequences: forward Ornstein-Uhlenbeck (OU) noise is reversed by score-based networks, and the denoised representations are regularized by a bisimulation metric. The authors present this methodology as the first to provide both theoretical guarantees and practical gains for extracting causal states in P²OMDPs (Li et al., 29 Nov 2025).
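The forward OU noising step has a closed-form Gaussian marginal, which is what makes it convenient for score-based reversal. The sketch below shows only that generic forward process for $dx = -\theta x\,dt + \sigma\,dW$; the parameter names `theta` and `sigma`, the helper names, and the use of a single shared diffusion time are assumptions for illustration and do not reproduce CaDiff's asynchronous schedule or its score network.

```python
import math
import random


def ou_marginal(x0, t, theta=1.0, sigma=1.0):
    """Mean and standard deviation of x_t given x_0 under the OU process."""
    mean = x0 * math.exp(-theta * t)
    var = sigma ** 2 / (2 * theta) * (1 - math.exp(-2 * theta * t))
    return mean, math.sqrt(var)


def ou_forward(sequence, t, theta=1.0, sigma=1.0, rng=None):
    """Sample a noised copy of `sequence` at diffusion time `t`."""
    rng = rng or random.Random(0)  # fixed seed only for reproducibility
    noisy = []
    for x0 in sequence:
        mean, std = ou_marginal(x0, t, theta, sigma)
        noisy.append(rng.gauss(mean, std))
    return noisy


observations = [1.0, -0.5, 2.0]  # hypothetical scalar observation sequence
print(ou_marginal(2.0, 0.0))     # → (2.0, 0.0)
print(ou_forward(observations, t=0.5))
```

At `t = 0` the sequence is returned unchanged, and as `t` grows every element forgets its starting value and approaches the stationary distribution with standard deviation `sigma / sqrt(2 * theta)`.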
5. Causal States and Bisimulation in Concurrency and Process Theory
Petri nets, event structures, and process calculi formalize non-interleaving or "true concurrency" systems. Step-net bisimulation, causal-net bisimulation, and history-preserving bisimulation progressively refine the granularity of causal-state tracking:
- Step-net bisimilarity: matches sets of concurrently enabled transitions; equivalence = identical concurrency/conflict structure (Gorrieri, 2022).
- Causal-net and structure-preserving bisimulation: maintain bijections on events and tokens, ensuring preservation of causal dependencies and identification of causal states.
- History-preserving (HP) bisimulation: tracks not only event duration (ST-bisimilarity) but also the exact causal order among events in asynchronous transition systems (Aubert et al., 2022).
These frameworks reveal that bisimulation equivalence classes encode precisely the causal state relevant for predicting or simulating system behavior.
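The difference between interleaving and concurrency-aware equivalences can be seen on the classic pair "a in parallel with b" versus "a then b, or b then a". The sketch below is a minimal illustration, assuming safe (1-bounded) nets encoded as dictionaries and ordinary step bisimilarity on reachable markings; it is not an implementation of the net-based or history-preserving variants in the cited works, and all names are hypothetical.

```python
from itertools import combinations


def steps(net, marking, max_size):
    """Yield (labels, next_marking) for each set of up to `max_size`
    transitions that can fire together (marked, pairwise-disjoint presets)."""
    enabled = [t for t in net if net[t][0] <= marking]
    for k in range(1, max_size + 1):
        for group in combinations(enabled, k):
            pres = [net[t][0] for t in group]
            consumed = frozenset().union(*pres)
            if sum(len(p) for p in pres) != len(consumed):
                continue  # overlapping presets: the transitions conflict
            produced = frozenset().union(*(net[t][2] for t in group))
            labels = tuple(sorted(net[t][1] for t in group))
            yield labels, (marking - consumed) | produced


def reachable_lts(net, init, max_size):
    """Map every reachable marking to its list of outgoing steps."""
    lts, todo = {}, [init]
    while todo:
        m = todo.pop()
        if m not in lts:
            lts[m] = list(steps(net, m, max_size))
            todo.extend(nxt for _, nxt in lts[m])
    return lts


def bisimilar(net1, init1, net2, init2, max_size):
    """Greatest-fixed-point bisimilarity; max_size=1 gives interleaving."""
    lts1 = reachable_lts(net1, init1, max_size)
    lts2 = reachable_lts(net2, init2, max_size)
    rel = {(m1, m2) for m1 in lts1 for m2 in lts2}
    changed = True
    while changed:
        changed = False
        for m1, m2 in list(rel):
            fwd = all(any(l1 == l2 and (n1, n2) in rel for l2, n2 in lts2[m2])
                      for l1, n1 in lts1[m1])
            bwd = all(any(l1 == l2 and (n1, n2) in rel for l1, n1 in lts1[m1])
                      for l2, n2 in lts2[m2])
            if not (fwd and bwd):
                rel.discard((m1, m2))
                changed = True
    return (init1, init2) in rel


fs = frozenset
# transition -> (preset, label, postset)
PAR = {"ta": (fs({"p1"}), "a", fs({"p3"})), "tb": (fs({"p2"}), "b", fs({"p4"}))}
SEQ = {"t1": (fs({"q0"}), "a", fs({"q1"})), "t2": (fs({"q1"}), "b", fs({"q3"})),
       "t3": (fs({"q0"}), "b", fs({"q2"})), "t4": (fs({"q2"}), "a", fs({"q3"}))}

print(bisimilar(PAR, fs({"p1", "p2"}), SEQ, fs({"q0"}), max_size=1))  # → True
print(bisimilar(PAR, fs({"p1", "p2"}), SEQ, fs({"q0"}), max_size=2))  # → False
```

With single-transition steps the two nets are indistinguishable, but once steps of size two are allowed the parallel net can fire `a` and `b` together and the sequential one cannot, so the equivalence separates them.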
6. Applications, Empirical Results, and Hierarchies
Empirical evaluation of causal-state representations in RL demonstrates substantial gains in sample efficiency and robustness to partial observability and perturbations. In CaDiff, denoising with ADM combined with bisimulation regularization outperforms both diffusion-only and bisimulation-only algorithms by at least 14.18% across diverse Roboschool tasks (Li et al., 29 Nov 2025). The theoretical analysis in Zhang et al. (2019) provides provable upper and lower bounds on suboptimality in planning with learned causal-state embeddings.
In concurrency theory, the hierarchy of bisimulations (step-net, causal-net and structure-preserving, and history-preserving) maps the levels of causal-structural precision needed for system analysis, specification, and attacker modeling.
7. Summary Table: Bisimulation and Causal State Approaches
| Framework | Causal State Construction | Bisimulation Criterion |
|---|---|---|
| RL/POMDPs (Zhang et al., 2019, Li et al., 29 Nov 2025) | Coarsest history partition that predicts future | Reward + transition consistency; possibly metric-based |
| Petri nets (Gorrieri, 2022, Bruni et al., 2015) | Markings/events up to concurrent causality | Step/causal/structure bisimilarity |
| π-calculus (Aubert et al., 2022) | Pending event-sets in LATS/event structures | ST-bisimilarity, HP-bisimilarity |
| Categorical (Bonchi et al., 2024) | State/transition as categorical coalgebras | Uniform feedback span bisimulation |
The unifying theme is that causal states provide the minimal granularity at which bisimulation must be enforced to fully capture the system's behavioral information; bisimulation metrics and coalgebraic/categorical constructions enable computation, minimization, and quantitative analysis in both classical and learning-theoretic settings.