Markov Categories: A Categorical Framework
- Markov categories are symmetric monoidal categories where every object carries a cocommutative comonoid structure (copy and discard maps) that encode probabilistic independence.
- They unify classical probability, statistical inference, and quantum models by formalizing conditional probability, Bayesian inversion, and entropy in an algebraic framework.
- Key examples include categories like FinStoch and BorelStoch, and extensions that support infinite products, Kolmogorov extension, and categorical definitions of information theory.
A Markov category is a symmetric monoidal category in which each object is equipped with a cocommutative comonoid structure (copy and discard maps) that encodes the categorical structure of probabilistic independence and stochastic processes. Markov categories provide a unifying categorical foundation for probability theory, statistics, information theory, causal inference, and their generalizations, encompassing both classical and quantum settings. The framework supports the formalization of conditional probability, Bayesian inversion, Kolmogorov extension, entropy, and infinite product constructions in a purely algebraic and string-diagrammatic language.
1. Algebraic Foundations and Axioms
A Markov category is a symmetric monoidal category $(\mathsf{C}, \otimes, I)$ with, for each object $X$, specified morphisms:
- Copy (comultiplication): $\mathrm{copy}_X \colon X \to X \otimes X$
- Discard (counit): $\mathrm{del}_X \colon X \to I$
These satisfy the following cocommutative comonoid axioms:
- Coassociativity: $(\mathrm{copy}_X \otimes \mathrm{id}_X) \circ \mathrm{copy}_X = (\mathrm{id}_X \otimes \mathrm{copy}_X) \circ \mathrm{copy}_X$
- Counitality: $(\mathrm{del}_X \otimes \mathrm{id}_X) \circ \mathrm{copy}_X = \mathrm{id}_X = (\mathrm{id}_X \otimes \mathrm{del}_X) \circ \mathrm{copy}_X$
- Cocommutativity: $\sigma_{X,X} \circ \mathrm{copy}_X = \mathrm{copy}_X$, with symmetry $\sigma_{X,X} \colon X \otimes X \to X \otimes X$
- Monoidal coherence: $\mathrm{copy}_{X \otimes Y} = (\mathrm{id}_X \otimes \sigma_{X,Y} \otimes \mathrm{id}_Y) \circ (\mathrm{copy}_X \otimes \mathrm{copy}_Y)$ and $\mathrm{del}_{X \otimes Y} = \mathrm{del}_X \otimes \mathrm{del}_Y$ (plus standard associativity/braiding compatibilities)
- Causality (or affineness): $\mathrm{del}_Y \circ f = \mathrm{del}_X$ for every $f \colon X \to Y$ (equivalently, the monoidal unit $I$ is terminal), requiring normalization/totality of probability mass
A morphism $f \colon X \to Y$ is deterministic if it commutes with copying: $\mathrm{copy}_Y \circ f = (f \otimes f) \circ \mathrm{copy}_X$.
These structures generalize and abstract the usual probabilistic semantics:
- Objects are sample spaces
- Morphisms are stochastic kernels (probability-preserving); in many examples these are measurable Markov kernels or stochastic matrices
- Copy represents duplication of variables; discard represents marginalization
Markov categories specialize gs-monoidal (CD) categories, restricting to cases where the monoidal unit is terminal and discard is natural (Stein, 4 Mar 2025, Fritz et al., 2023, Fritz et al., 2022, Perrone, 2022).
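As a concrete illustration, the comonoid structure of FinStoch can be sketched in a few lines of numpy. The conventions here (column-stochastic matrices, helper names such as `is_deterministic`) are illustrative choices for this sketch, not an established API:

```python
import numpy as np

# A FinStoch morphism f : X -> Y with |X| = m, |Y| = n is an n x m
# column-stochastic matrix: column x is the distribution f(-|x).

def compose(g, f):
    """Chapman-Kolmogorov composition: (g o f)(z|x) = sum_y g(z|y) f(y|x)."""
    return g @ f

def copy(n):
    """copy_X : X -> X (x) Y with Y = X: sends x to the Dirac pair (x, x)."""
    c = np.zeros((n * n, n))
    for x in range(n):
        c[x * n + x, x] = 1.0
    return c

def discard(n):
    """del_X : X -> I, the unique map to the one-point space."""
    return np.ones((1, n))

def is_deterministic(f):
    """Checks copy_Y o f == (f (x) f) o copy_X, i.e. every column is a Dirac."""
    n, m = f.shape
    lhs = compose(copy(n), f)
    rhs = compose(np.kron(f, f), copy(m))
    return bool(np.allclose(lhs, rhs))
```

For example, a permutation matrix passes `is_deterministic`, while a genuinely random kernel fails it, and counitality `(del (x) id) o copy = id` holds as a matrix identity.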
2. Key Examples
| Category | Objects | Morphisms | Copy/Discard Structure |
|---|---|---|---|
| FinStoch | Finite sets | Stochastic matrices | Deterministic copy, total discard |
| BorelStoch | Standard Borel spaces | Markov kernels | Dirac-diagonal copy, unique discard |
| Gauss | Euclidean spaces $\mathbb{R}^n$ | Affine maps + Gaussian noise | Diagonal copy into the direct sum, discard to the zero space |
| Kleisli of nonempty powerset | Sets | Left-total relations | Diagonal relation, trivial discard |
| $\mathrm{Kl}(T)$ | Sets | Kleisli morphisms of distribution monads | Inherited from base |
Kleisli categories for affine, commutative monads on cartesian categories provide canonical Markov categories (e.g., the category of distributions, including Giry or finite-support probability monads) (Fritz et al., 2020, Fritz et al., 2023).
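A minimal sketch of such a Kleisli construction, using the finite-support distribution monad over plain Python dicts (names like `kleisli_compose` are illustrative, not from any library):

```python
from collections import defaultdict

# A finite-support distribution is a dict {outcome: probability}.
# Kleisli arrows X -> D(Y) are ordinary functions; composition
# marginalizes over the intermediate variable.

def unit(x):
    """Monad unit: the Dirac distribution at x."""
    return {x: 1.0}

def kleisli_compose(g, f):
    """(g after f)(x) assigns z the mass sum_y f(x)[y] * g(y)[z]."""
    def composite(x):
        out = defaultdict(float)
        for y, p in f(x).items():
            for z, q in g(y).items():
                out[z] += p * q
        return dict(out)
    return composite
```

The unit laws of the monad give exactly the identity morphisms of the Kleisli category, and affineness of the monad (distributions sum to 1) gives the terminal unit object required of a Markov category.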
Quantum Markov categories generalize via additional involutive structure, supporting completely positive unital maps between $C^*$-algebras, enabling a unified treatment of classical and quantum probabilistic models (Fritz et al., 2023, Parzygnat, 2020).
3. Conditional Probability, Conditionals, and Bayesian Inversion
A central structure in Markov categories is the existence (and essential uniqueness) of conditionals, abstracting regular conditional probabilities: given $f \colon A \to X \otimes Y$, there exist:
- $f_X := (\mathrm{id}_X \otimes \mathrm{del}_Y) \circ f \colon A \to X$ ("marginal")
- $f_{|X} \colon X \otimes A \to Y$ ("conditional") such that
$$f = (\mathrm{id}_X \otimes f_{|X}) \circ (\mathrm{copy}_X \otimes \mathrm{id}_A) \circ (f_X \otimes \mathrm{id}_A) \circ \mathrm{copy}_A$$
(Lavore et al., 2023, Fritz et al., 2024, Yin, 2022).
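In FinStoch, conditionals are the familiar splitting of a joint distribution into marginal and conditional, and recomposing recovers the joint. A sketch with hypothetical helper names, assuming the marginal has full support:

```python
import numpy as np

# Split a finite joint state p : I -> X (x) Y, given as a matrix p[x, y],
# into its marginal p_X and conditional p(y|x); then recompose.

def split_joint(p):
    """Return the marginal p_X and the conditional p(y|x) of a joint p[x, y]."""
    p_X = p.sum(axis=1)
    cond = p / p_X[:, None]          # row x of cond is the distribution p(-|x)
    return p_X, cond

def recompose(p_X, cond):
    """Sample x from p_X, copy it, then sample y from p(-|x): recovers p."""
    return p_X[:, None] * cond
```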
Bayesian inversion is categorically realized via conditionals, recovering both classical Bayesian updating and its quantum analogs (Parzygnat, 2020, Comfort et al., 20 Dec 2025). The notion of conditionals underpins categorical versions of the Bayes filter, backward smoothing equations, and the abstract Blackwell–Sherman–Stein theorem (Fritz et al., 2024, Fritz et al., 2020).
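For finite models, the Bayesian inverse is elementwise Bayes' rule: the inverse channel $f^\dagger$ of $f$ with respect to a prior $p$ satisfies $f^\dagger(x|y)\,(f\,p)(y) = f(y|x)\,p(x)$. A sketch (the name `bayes_invert` is illustrative):

```python
import numpy as np

def bayes_invert(f, p):
    """Bayesian inverse of a channel f[x, y] = f(y|x) w.r.t. a prior p on X.

    Returns the posterior channel fd[y, x] = p(x|y), assuming every
    observation y has positive evidence under the prior.
    """
    joint = f * p[:, None]           # joint(x, y) = f(y|x) p(x)
    evidence = joint.sum(axis=0)     # pushforward (f p)(y)
    return (joint / evidence).T      # normalize each y-column, transpose
```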
A representable Markov category admits for every object $X$ a distribution object $PX$ (the functor $P$ being right adjoint to the inclusion of the subcategory of deterministic morphisms), supporting sampling, push-forward, and abstract monadic probabilistic semantics (Fritz et al., 2020).
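In FinStoch this representability is concrete: $PX$ is the probability simplex on $X$, the "name" of a kernel $f \colon A \to X$ is the deterministic map sending $a$ to the distribution $f(-|a)$, and composing with the sampling map recovers $f$. A sketch with hypothetical helper names:

```python
import numpy as np

def name_of(f):
    """Deterministic map a |-> f(-|a) into the simplex PX (column extraction)."""
    return lambda a: f[:, a]

def sample_then(name, a, rng):
    """The sampling map samp_X applied after the name: draws x ~ f(-|a)."""
    dist = name(a)
    return int(rng.choice(len(dist), p=dist))
```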
4. Infinite Products, Kolmogorov Extension, and Zero–One Laws
Markov categories admit the categorical formulation of infinite products and the Kolmogorov extension theorem:
- Kolmogorov product: an infinite tensor (product) object $X_J = \bigotimes_{j \in J} X_j$ (for an index set $J$) such that each diagram of finite marginals $\pi_F \colon X_J \to X_F$ (for $F \subseteq J$ finite) commutes, with each $\pi_F$ deterministic
- Existence in $\mathsf{BorelStoch}$ recovers the classical extension theorem for standard Borel spaces
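Spelled out, the compatibility condition on the finite marginals reads as follows (a sketch of the standard formulation, in the notation above):

```latex
% Cone condition for the Kolmogorov product X_J = \bigotimes_{j \in J} X_j:
% for finite subsets F \subseteq F' \subseteq J, the deterministic marginal
% projections must be compatible,
\pi_F \;=\; \bigl(\mathrm{id}_{X_F} \otimes \mathrm{del}_{X_{F' \setminus F}}\bigr)
            \circ \pi_{F'} \;\colon\; X_J \longrightarrow X_F ,
```

with $X_J$ the limit of this cone of finite tensor products, taken so that the limit is preserved by tensoring with any object.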
Zero–one laws are derived categorically:
- Kolmogorov zero–one law: any statistic of i.i.d. variables in an infinite-product Markov category that depends only on the tail (i.e. is unaffected by any finite set of coordinates) is deterministic, recovering the classical triviality of tail events
- Hewitt–Savage zero–one law: any statistic invariant under finite permutations of i.i.d. variables is deterministic (Fritz et al., 2019).
These laws apply not just in classical probability, but in algebraic and topological settings (e.g. commutative rings, poset monoids, topological hyperspaces).
5. Enrichments: Entropy, Divergence, and Information Theory
Markov categories can be Div-enriched: hom-sets are equipped with divergences (e.g., KL, Rényi, total variation), defining quantity-like functionals such as entropy and mutual information categorically (Perrone, 2022):
- Categorical entropy of a state $p \colon I \to X$: $H(p) := D\big(\mathrm{copy}_X \circ p \,\|\, p \otimes p\big)$
- Shannon, Rényi, Tsallis, and Gini–Simpson entropies are special cases
- Categorical mutual information: for a joint state $p \colon I \to X \otimes Y$, $I(X;Y) := D\big(p \,\|\, p_X \otimes p_Y\big)$
- Data-processing inequality (DPI) holds at the abstract level due to non-expansiveness of composition and tensor product
- Spectral and information-geometric interpretations of learning objectives in LLMs and machine learning are formulated via Markov categories and entropy (Zhang, 25 Jul 2025)
This formalism recovers standard information-theoretic quantities, generalized entropies, and generalized data-processing inequalities abstractly.
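These definitions can be checked numerically in the finite case. The sketch below (illustrative helper names) recovers Shannon entropy as the KL divergence of copying from independent sampling, computes mutual information as a divergence from the product of marginals, and exhibits an instance of the data-processing inequality:

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for finite distributions."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def entropy(p):
    """H(p) = D(copy o p || p (x) p): copy o p puts mass p(x) on (x, x)."""
    n = len(p)
    copy_p = np.zeros(n * n)
    copy_p[np.arange(n) * n + np.arange(n)] = p
    return kl(copy_p, np.kron(p, p))

def mutual_info(joint):
    """I(X;Y) = D(p_XY || p_X (x) p_Y) for a joint matrix joint[x, y]."""
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    return kl(joint.ravel(), np.kron(px, py))
```

Post-processing the $Y$ variable through any channel can only shrink the mutual information, which is the DPI in this finite setting.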
6. Extensions: Partial, Weakly, and Quantum Markov Categories
- Partial Markov Categories: Relax naturality of discard, admitting partial (possibly non-normalized) processes, supporting exact algebraic conditioning, and formalizations of partial probability theory and evidential decision theory (Lavore et al., 2023, Mohammed, 5 Sep 2025).
- **Weak