Axiomatic Entropy & Diversity

Updated 9 June 2026

Entropy and diversity are measures that quantify uncertainty and variety, defined via key axioms such as additivity, normalization, and symmetry.
The axiomatic framework classifies classic forms like Shannon, Tsallis, and Rényi entropies through minimal assumptions and scaling laws.
Extensions incorporating state dissimilarities enable practical applications in ecology, network analysis, and the study of non-ergodic systems.

Entropy and diversity, central concepts in both information theory and ecology, possess deep mathematical connections. The axiomatic approach provides a principled basis for quantifying these notions, clarifying both their mathematical structure and their applicability to complex or non-ergodic systems. This article surveys the major strands of the axiomatic theory of entropy and diversity, emphasizing their formal correspondences, classification schemes, extensions to generalized trace-form entropies, and implications for diversity measurement.

1. Minimal Axioms for Entropy and Diversity

Classical entropy is defined as a real-valued map on the probability simplex $P(n) = \{ p = (p_1,\dots,p_n) \mid p_i \geq 0,\, \sum_i p_i = 1 \}$. The standard axiomatic scheme isolates three essential properties (Gour et al., 2020):

Schur-concavity (Monotonicity under mixing): If $p \succ p'$ (i.e., $p'$ is more "mixed" than $p$), then $H(p) \leq H(p')$.
Additivity: For independent distributions $p \in P(n)$, $q \in P(m)$, $H(p \otimes q) = H(p) + H(q)$.
Normalization: $H(u_2) = \log 2$, where $u_k$ is the uniform distribution on $k$ outcomes.

These yield Shannon entropy $H(p) = -\sum_i p_i \log p_i$ uniquely, up to scale and choice of logarithm base.
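The three axioms above can be checked numerically for Shannon entropy. The following is a minimal sketch, not part of the cited work: the function names and the test distributions are illustrative assumptions, and the "more mixed" distribution is built by averaging $p$ with the uniform distribution, which is one simple way to obtain a distribution majorized by $p$.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in nats; outcomes with zero probability contribute 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def product_distribution(p, q):
    """Joint distribution of two independent variables (the tensor product of p and q)."""
    return [a * b for a in p for b in q]

# Illustrative distributions (assumed values, not taken from the article).
p = [0.7, 0.2, 0.1]
q = [0.5, 0.5]

# Additivity: H(p x q) - (H(p) + H(q)) should vanish up to rounding.
joint = product_distribution(p, q)
additivity_gap = shannon_entropy(joint) - (shannon_entropy(p) + shannon_entropy(q))

# Normalization: H(u_2) - log 2 should vanish.
normalization_gap = shannon_entropy([0.5, 0.5]) - math.log(2)

# Schur-concavity: averaging p with the uniform distribution applies a doubly
# stochastic map, so the result is majorized by p and cannot have lower entropy.
mixed = [0.5 * x + 0.5 / len(p) for x in p]

print(additivity_gap, normalization_gap)
print(shannon_entropy(p), shannon_entropy(mixed))
```

Natural logarithms are used throughout; changing the base only rescales every quantity by the same constant.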

For diversity measures $D(p)$, similar axioms are employed to ensure invariance under relabeling (symmetry), continuity, effective-number normalization ($D(u_n) = n$), and a modular or replication property (Leinster, 2020). The diversity index then corresponds (via exponentiation) to the associated entropy, $D(p) = \exp H(p)$, and more generally to Hill numbers:

$D_q(p) = \left( \sum_{i=1}^{n} p_i^q \right)^{1/(1-q)}$

where $q \geq 0$, $q \neq 1$, and the limit $q \to 1$ recovers the Shannon case.
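As a concrete illustration, the sketch below computes Hill numbers using the standard definition $D_q(p) = (\sum_i p_i^q)^{1/(1-q)}$, with the $q \to 1$ case handled as the exponential of Shannon entropy. The function name, the example abundances, and the replication factor `m` are assumptions made for the example.

```python
import math

def hill_number(p, q):
    """Hill number (effective number of types) of order q for abundances p."""
    support = [x for x in p if x > 0]
    if math.isclose(q, 1.0):
        # q -> 1 limit: exponential of the Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in support))
    if math.isinf(q):
        # q -> infinity limit: reciprocal of the largest abundance.
        return 1.0 / max(support)
    return sum(x ** q for x in support) ** (1.0 / (1.0 - q))

# Illustrative community with three types (assumed values).
p = [0.5, 0.3, 0.2]
richness = hill_number(p, 0)      # counts the types present
shannon_div = hill_number(p, 1)   # exp(Shannon entropy)
simpson_div = hill_number(p, 2)   # inverse Simpson concentration

# Replication: m equally weighted copies of the community, sharing no types,
# should have exactly m times the diversity at every order q.
m = 4
replicated = [x / m for x in p for _ in range(m)]
ratios = [hill_number(replicated, q) / hill_number(p, q) for q in (0, 0.5, 1, 2)]

print(richness, shannon_div, simpson_div)
print(ratios)
```

The replication check is the numerical counterpart of the effective-number normalization: a uniform distribution on $n$ types has diversity $n$ at every order.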

2. The Shannon–Khinchin Axiomatic Scheme and Extensions

The classical Shannon–Khinchin (SK) axioms (continuity, maximality at uniform, expansibility, and separability/additivity) uniquely determine the Boltzmann–Gibbs (Shannon) entropy (Leinster, 2020; Tempesta, 2014). For probability vectors $p = (p_1,\dots,p_n)$ on $n$ outcomes:

Continuity (SK1): $H(p)$ is continuous in $p$.
Maximality (SK2): Maximum at the uniform distribution.
Expansibility (SK3): $H(p_1,\dots,p_n,0) = H(p_1,\dots,p_n)$.
Separability/Additivity (SK4): $H(p \otimes q) = H(p) + H(q)$ for independent systems.

Relaxing SK4 (recursivity), as is natural in non-ergodic or strongly interacting systems, leads to generalized trace-form entropies that are not strictly additive (Thurner et al., 2011). In this generalization, entropic forms are classified by asymptotic scaling exponents rather than by strict functional equations.
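Tsallis entropy gives a simple example of an entropy that violates strict additivity while obeying a modified composition rule. The sketch below assumes the standard definition $S_q(p) = (1 - \sum_i p_i^q)/(q-1)$ and the well-known pseudo-additivity relation $S_q(A,B) = S_q(A) + S_q(B) + (1-q)\,S_q(A)\,S_q(B)$ for independent systems. The distributions and the value of `q` are illustrative.

```python
def tsallis_entropy(p, q):
    """Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1), defined for q != 1."""
    return (1.0 - sum(x ** q for x in p if x > 0)) / (q - 1.0)

# Two independent systems (assumed values) and their joint distribution.
p = [0.6, 0.4]
r = [0.7, 0.2, 0.1]
q = 0.5
joint = [a * b for a in p for b in r]

s_p = tsallis_entropy(p, q)
s_r = tsallis_entropy(r, q)
s_joint = tsallis_entropy(joint, q)

# Strict additivity fails: the gap equals (1 - q) * S_q(p) * S_q(r).
additive_gap = s_joint - (s_p + s_r)

# Pseudo-additivity holds up to floating-point rounding.
pseudo_additive_gap = s_joint - (s_p + s_r + (1.0 - q) * s_p * s_r)

print(additive_gap, pseudo_additive_gap)
```

As $q \to 1$ the cross term vanishes and ordinary additivity is recovered.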

3. Generalized Trace-Form Entropy and Scaling Laws

The broad class of generalized trace-form entropies, $S_g(p) = \sum_i g(p_i)$ with $g$ continuous and concave, is classified by two asymptotic scaling exponents $c$ and $d$:

Primary exponent $c$: Controls power-law scaling as $g(zx)/g(x) \to z^c$ as $x \to 0^+$.
Secondary exponent $d$: Arises from $\lim_{x \to 0^+} \frac{g(x^{1+a})}{x^{ac}\, g(x)} = (1+a)^d$; it further refines the asymptotic class.
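The primary exponent can be estimated numerically. The sketch below assumes the scaling relation $g(zx)/g(x) \to z^c$ as $x \to 0^+$ from the Hanel–Thurner classification, and uses two trace-form generators: $g(x) = -x \ln x$ for Boltzmann–Gibbs and $g(x) = (x - x^q)/(q-1)$ for Tsallis. The helper names and the evaluation point `x` are illustrative choices.

```python
import math

def g_boltzmann_gibbs(x):
    """Trace-form generator of Boltzmann-Gibbs entropy."""
    return -x * math.log(x)

def g_tsallis(x, q=0.5):
    """Trace-form generator of Tsallis entropy; uses sum_i p_i = 1."""
    return (x - x ** q) / (q - 1.0)

def estimate_c(g, z=0.5, x=1e-200):
    """Estimate c from g(z*x) / g(x) ~ z**c at a small but finite x."""
    return math.log(g(z * x) / g(x)) / math.log(z)

c_bg = estimate_c(g_boltzmann_gibbs)   # approaches 1, but only logarithmically
c_tsallis = estimate_c(g_tsallis)      # approaches q = 0.5

print(c_bg, c_tsallis)
```

The Boltzmann–Gibbs estimate converges slowly because $g(zx)/g(x) = z\,(1 + \ln z / \ln x)$ carries a logarithmic correction. Corrections of this kind are what the secondary exponent is designed to capture.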

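The canonical entropy $S_{c,d}$ for each equivalence class $(c,d)$ takes the form (Thurner et al., 2011):

$S_{c,d}[p] = \frac{e \sum_i \Gamma(1+d,\, 1 - c \ln p_i)}{1 - c + cd} - \frac{c}{1 - c + cd}$

where $\Gamma(a,b) = \int_b^\infty t^{a-1} e^{-t}\, dt$ is the incomplete gamma function.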

Special cases include:

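Entropic Family	$S(p)$ form	Scaling Exponents $(c,d)$
Boltzmann–Gibbs	$-\sum_i p_i \ln p_i$	$(1, 1)$
Tsallis	$\frac{1 - \sum_i p_i^q}{q - 1}$	$(q, 0)$
Stretched-Exponential	$\sum_i \left[ \Gamma\!\left(1 + \tfrac{1}{\eta},\, -\ln p_i\right) - p_i\, \Gamma\!\left(1 + \tfrac{1}{\eta}\right) \right]$	$(1, 1/\eta)$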

Such a scheme accommodates both classical and nonadditive entropies, relevant for complex systems, superstatistics, and non-ergodic contexts.

4. Relative Entropy, Divergence Measures, and Faithfulness

Relative entropy (information divergence) is axiomatized via the following (Gour et al., 2020):

Data-processing inequality (B1): $D(Np \,\|\, Nq) \leq D(p \,\|\, q)$ under stochastic maps $N$.
Additivity (B2): $D(p_1 \otimes p_2 \,\|\, q_1 \otimes q_2) = D(p_1 \,\|\, q_1) + D(p_2 \,\|\, q_2)$.
Normalization (B3): $D\big((1,0) \,\|\, u_2\big) = \log 2$.

These define Kullback–Leibler divergence, $D_{\mathrm{KL}}(p \,\|\, q) = \sum_i p_i \log \frac{p_i}{q_i}$, as unique among divergence functions satisfying these properties and continuity on the appropriate domain.

Sharp bounds on relative entropy are given by the min- and max-Rényi divergences:

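$D_{\min}(p \,\|\, q) = -\log \sum_{i:\, p_i > 0} q_i, \qquad D_{\max}(p \,\|\, q) = \log \max_{i:\, p_i > 0} \frac{p_i}{q_i}$

Every normalized relative entropy $D$ satisfies $D_{\min}(p \,\|\, q) \leq D(p \,\|\, q) \leq D_{\max}(p \,\|\, q)$. Faithfulness (that is, $D(p \,\|\, q) = 0$ implies $p = q$) fails only for $D_{\min}$ ($\alpha = 0$), while all $\alpha > 0$ cases, including Kullback–Leibler and Rényi divergences, are faithful.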
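The sandwich between the min- and max-divergences can be illustrated numerically. The sketch below assumes the standard definitions of the Rényi divergences of order 0 and $\infty$, namely $D_{\min}(p\|q) = -\log \sum_{i: p_i > 0} q_i$ and $D_{\max}(p\|q) = \log \max_i p_i/q_i$, and takes $q$ to have full support. The function names and example distributions are illustrative.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def d_min(p, q):
    """Renyi divergence of order 0: minus the log of the q-mass on the support of p."""
    return -math.log(sum(b for a, b in zip(p, q) if a > 0))

def d_max(p, q):
    """Renyi divergence of order infinity: log of the largest ratio p_i / q_i."""
    return math.log(max(a / b for a, b in zip(p, q) if a > 0))

# Bounds: D_min <= D_KL <= D_max (assumed example values).
p = [0.7, 0.3, 0.0]
q = [0.2, 0.3, 0.5]
bounds = (d_min(p, q), kl_divergence(p, q), d_max(p, q))

# Faithfulness fails for D_min: two distinct full-support distributions
# give D_min = 0, while the KL divergence stays strictly positive.
p2 = [0.7, 0.3]
q2 = [0.5, 0.5]
dmin_distinct = d_min(p2, q2)
kl_distinct = kl_divergence(p2, q2)

print(bounds)
print(dmin_distinct, kl_distinct)
```

$D_{\min}$ depends only on the support of $p$, which is why it cannot distinguish distributions that share a support.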

There is a bijective correspondence between entropies and relative entropies: for any entropy $H$, $D(p \,\|\, u_n) = \log n - H(p)$ under an appropriate extension, and conversely, $H(p) = \log n - D(p \,\|\, u_n)$ (Gour et al., 2020).
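For the Shannon case the correspondence can be verified in one line. Taking the standard Kullback–Leibler divergence against the uniform distribution $u_n$ on $n$ outcomes:

$D_{\mathrm{KL}}(p \,\|\, u_n) = \sum_i p_i \log \frac{p_i}{1/n} = \log n + \sum_i p_i \log p_i = \log n - H(p)$

so Shannon entropy measures how far $p$ falls short of the maximal value $\log n$ attained at $u_n$.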

5. Categorical and Composability Frameworks

Category-theoretic approaches axiomatize entropy and relative entropy using enriched symmetric monoidal categories (SMCs) (Sarkis et al., 4 Mar 2026). In this formalism, KL and Rényi divergences are characterized by chain-rule axioms and monotonicity under stochastic maps. The structure makes evident two composition operations (Kronecker product for parallel and direct sum for choice), each enabling full diagrammatic completeness results. Shannon entropy and Hill diversities emerge in this context as unique invariants—distances to the uniform distribution—ensuring the replicative and compositional principles in both the information-theoretic and ecological settings.

The group-theoretical framework (universal-group entropy) further generalizes entropy via formal group laws, capturing composability, symmetry, and associativity in a power-series law $\Phi(x, y) = x + y + \text{(higher-order terms)}$ (Tempesta, 2014). All major classical and generalized entropies become specific cases under this approach, and the induced diversity indices inherit the group composability property, which generalizes the replication principle.
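A familiar instance of such a composition law is the multiplicative formal group law $\Phi(x, y) = x + y + (1-q)\,xy$, which is the pseudo-additivity rule of Tsallis entropy. The sketch below checks the group-law properties (identity, symmetry, associativity) numerically. Treating this as the law associated with Tsallis entropy is a standard identification assumed here, and the function name and sample values are illustrative.

```python
def tsallis_group_law(x, y, q):
    """Multiplicative formal group law: Phi(x, y) = x + y + (1 - q) * x * y."""
    return x + y + (1.0 - q) * x * y

q = 0.5
x, y, z = 0.3, 1.1, 0.7   # arbitrary sample values

# Associativity: Phi(Phi(x, y), z) == Phi(x, Phi(y, z)) up to rounding.
left = tsallis_group_law(tsallis_group_law(x, y, q), z, q)
right = tsallis_group_law(x, tsallis_group_law(y, z, q), q)
associativity_gap = left - right

# Identity element: Phi(x, 0) == x.
identity_gap = tsallis_group_law(x, 0.0, q) - x

# Symmetry: Phi(x, y) == Phi(y, x).
symmetry_gap = tsallis_group_law(x, y, q) - tsallis_group_law(y, x, q)

# At q = 1 the law reduces to ordinary addition (the Boltzmann-Gibbs case).
additive_limit_gap = tsallis_group_law(x, y, 1.0) - (x + y)

print(associativity_gap, identity_gap, symmetry_gap, additive_limit_gap)
```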

6. Diversity with State Dissimilarities and Affinity Extensions

Extensions of diversity indices to include pairwise dissimilarities between states (species) have been established via axiomatic schemes (Okamura, 2018). For a symmetric dissimilarity matrix and an abundance vector $p$, effective diversity is constructed from the abundances together with a suitable kernel applied to the pairwise dissimilarities. The explicit formula and the side condition on the kernel are not recoverable from this copy of the text.

A key result is the Nesting Principle: for generalized diversity to remain invariant under grouping, a specific form is forced unless affinities vanish. In the zero-affinity limit, one recovers the classic Hill numbers and Boltzmann–Gibbs entropy.

Affinity-based extensions have profound consequences for both canonical distributions in statistical physics (equilibrium distributions are no longer uniform in general) and alpha-beta-gamma partitioning of biodiversity (Okamura, 2018).

7. Applications and Unifying Maximum Entropy Principles

In biodiversity measurement, the axiomatic approach underpins the existence and uniqueness of effective number-based diversity indices, including all Hill, Rényi, and Tsallis classes (Leinster, 2020, 0910.0906). Hill numbers unify metrics such as species richness, Shannon diversity, and Simpson diversity, ensuring consistent interpretations under species grouping and partitioning. Uniqueness theorems guarantee these indices are the only (continuous, symmetric, effective-number, compositional) measures compatible with the foundational axioms.

In the generalized similarity-diversity framework (0910.0906), a single abundance distribution maximizes all order-$q$ diversities for a given similarity matrix $Z$, establishing a robust "maximum diversity principle" with direct relevance to the identification of optimal community structures in ecology, conservation, and even network analysis.
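To make the role of the similarity matrix concrete, the sketch below computes a similarity-sensitive diversity in the Leinster–Cobbold form, $D_q^Z(p) = \big(\sum_i p_i\, (Zp)_i^{\,q-1}\big)^{1/(1-q)}$. Using this particular formula is an assumption about what the framework refers to. The function name and the example matrices are illustrative, and the code does not search for the maximizing distribution.

```python
import math

def similarity_diversity(p, Z, q):
    """Similarity-sensitive diversity of order q for abundances p and similarity matrix Z."""
    n = len(p)
    # (Zp)_i: average similarity of type i to the whole community.
    zp = [sum(Z[i][j] * p[j] for j in range(n)) for i in range(n)]
    pairs = [(p[i], zp[i]) for i in range(n) if p[i] > 0]
    if math.isclose(q, 1.0):
        # q -> 1 limit: exp(-sum_i p_i log (Zp)_i).
        return math.exp(-sum(a * math.log(b) for a, b in pairs))
    return sum(a * b ** (q - 1.0) for a, b in pairs) ** (1.0 / (1.0 - q))

p = [0.5, 0.3, 0.2]                                            # assumed abundances
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no similarity
partial = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]   # moderate similarity
identical = [[1.0] * 3 for _ in range(3)]                       # all types alike

naive = similarity_diversity(p, identity, 2)     # reduces to the Hill number of order 2
reduced = similarity_diversity(p, partial, 2)    # similarity lowers effective diversity
collapsed = similarity_diversity(p, identical, 2)  # behaves like a single type

print(naive, reduced, collapsed)
```

With the identity matrix the measure reduces to the ordinary Hill number, and increasing similarity between types can only lower the effective number of types.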

Table: Key Families of Entropy/Diversity Measures and Their Classification

Entropy/Diversity	Axiomatization	Trace-form parameterization	Maximum entropy distribution	Reference
Shannon/Boltzmann–Gibbs	Full SK axioms (additivity)	$c = 1$, $d = 1$	Exponential	(Thurner et al., 2011)
Tsallis	Weak additivity/pseudo-additivity	$c = q$, $d = 0$	$q$-exponential	(Thurner et al., 2011)
Rényi/Hill	Weak chain rule, effective number	$q$ parameter	Power mean form	(Leinster, 2020)
Group/Universal	Formal-group composability	Power series (formal group law)	General group exponential/logarithm	(Tempesta, 2014)
Affinity-based	Dissimilarity axioms, Nesting	Kernel of pairwise dissimilarities	Depends on the dissimilarity matrix and kernel	(Okamura, 2018)

Effective diversity measures thus arise as exponentials or power means of trace-form entropy functionals, classified by their scaling or compositional properties and by their interaction with state dissimilarities. The axiomatic approach ensures internal consistency and maximal generality, and it reveals the structural connections between entropy and diversity across physical, informational, and ecological systems.