Mutual Information Renormalizability
- The paper introduces an MI-based framework that quantifies renormalizability by ensuring only a finite number of degrees of freedom carry nontrivial long-range information.
- It applies variational principles and neural-network estimators to optimize the coarse-graining map, maximizing the mutual information between the retained coarse variables and their distant environment.
- The approach unifies renormalization insights across classical, quantum, equilibrium, and non-equilibrium systems by diagnosing RG flows through MI saturation.
A mutual information‐based measure of renormalizability provides a quantitative and universal framework for characterizing the retention of large‐scale or long‐distance information under renormalization group (RG) transformations. These measures arise naturally from information theory and are applicable across classical, quantum, equilibrium, and non‐equilibrium systems. The essential idea is that renormalizability—traditionally defined in terms of the ability to absorb divergences into a finite set of couplings—is operationally equivalent to the requirement that only a finite number of relevant degrees of freedom retain nontrivial mutual information across scales. Formalizations based on mutual information (MI) render this notion both rigorous and broadly applicable.
1. Information-Theoretic Motivation and Definition
A central observation is that the RG, in any incarnation, may be regarded as a controlled process of information loss or compression: microscopic details are integrated out, retaining only those degrees of freedom relevant to macroscopic behavior. The mutual information between the coarse-grained ("retained") degrees of freedom and their environment, or between modes at different scales (e.g., adjacent momentum shells), quantifies the amount of information preserved about the long-range, large-scale structure.
For a generic (classical or quantum) system, denote $\mathcal{X}$ as the microscopic configuration, $\mathcal{H}$ as the coarse variables (e.g., block spins), and $\mathcal{E}$ as the environment (Koch-Janusz et al., 2017, Lenggenhager et al., 2018, Gökmen et al., 2021). The real-space mutual information (RSMI) between $\mathcal{H}$ and $\mathcal{E}$ is
$$ I_{\Lambda}(\mathcal{H}:\mathcal{E}) \;=\; \sum_{\mathcal{H},\,\mathcal{E}} p_{\Lambda}(\mathcal{H},\mathcal{E})\,\ln\!\frac{p_{\Lambda}(\mathcal{H},\mathcal{E})}{p_{\Lambda}(\mathcal{H})\,p(\mathcal{E})}, $$
where $p_{\Lambda}(\mathcal{H}\mid\mathcal{V})$ is the RG coarse-graining map parameterized by $\Lambda$, and $\mathcal{V}$ is the visible patch from which $\mathcal{H}$ is constructed. In momentum space, for a quantum field theory (QFT) with field modes grouped into shells $A$ and $B$, the MI is defined as
$$ I(A:B) \;=\; S(\rho_{A}) + S(\rho_{B}) - S(\rho_{A\cup B}), $$
with $S(\rho) = -\operatorname{Tr}\rho\ln\rho$ the von Neumann entropy and $\rho_{A} = \operatorname{Tr}_{\bar A}\,\rho$ the reduced density matrix on shell $A$ (Bowen et al., 12 Nov 2025).
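To make the real-space definition concrete, the following minimal sketch evaluates $I(\mathcal{H}:\mathcal{E})$ exactly for a small 1D Ising chain by enumerating the Gibbs distribution; the chain length, coupling, block, buffer, and majority-sign coarse-graining map are illustrative assumptions, not choices taken from the cited works.

```python
import numpy as np
from itertools import product

# Toy setup (assumptions): N-site periodic 1D Ising chain at inverse temperature
# beta, exact enumeration of the Gibbs distribution.  The visible patch V is the
# first two spins, the coarse variable H is their majority sign, and the
# environment E is the far half of the chain (buffer sites 2, 3 are traced out).
N, J, beta = 8, 1.0, 0.6
block = [0, 1]          # visible patch V
env   = [4, 5, 6, 7]    # environment E

def energy(s):
    return -J * sum(s[i] * s[(i + 1) % N] for i in range(N))

# Exact Gibbs weights over all 2^N configurations.
configs = [np.array(c) for c in product([-1, 1], repeat=N)]
w = np.array([np.exp(-beta * energy(s)) for s in configs])
p = w / w.sum()

# Joint distribution of the coarse variable H and the environment E.
joint = {}
for s, ps in zip(configs, p):
    h = 1 if s[block].sum() >= 0 else -1        # coarse-graining map Lambda
    e = tuple(s[env])
    joint[(h, e)] = joint.get((h, e), 0.0) + ps

def mutual_information(joint):
    ph, pe = {}, {}
    for (h, e), pj in joint.items():
        ph[h] = ph.get(h, 0.0) + pj
        pe[e] = pe.get(e, 0.0) + pj
    return sum(pj * np.log(pj / (ph[h] * pe[e])) for (h, e), pj in joint.items())

print("I(H:E) =", mutual_information(joint), "nats")
```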
2. Variational Principle: Maximizing Mutual Information
The mutual information‐based measure of renormalizability typically assumes the form of a variational principle: the optimal coarse-graining is the one that maximizes the MI between the retained block degrees of freedom and the environment (Lenggenhager et al., 2018, Koch-Janusz et al., 2017, Gökmen et al., 2021). Concretely,
$$ \Lambda^{*} \;=\; \underset{\Lambda}{\arg\max}\; I_{\Lambda}(\mathcal{H}:\mathcal{E}). $$
The maximization ensures preservation of those combinations of degrees of freedom that best encode long-range (or low-energy) correlations. In practice, the conditional map $p_{\Lambda}(\mathcal{H}\mid\mathcal{V})$ is parametrized by restricted Boltzmann machines (RBMs) or neural networks, and variational lower bounds on the MI (such as InfoNCE) are optimized efficiently (Koch-Janusz et al., 2017, Gökmen et al., 2021).
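The sketch below illustrates this variational principle with an InfoNCE lower bound, jointly optimizing a linear coarse-graining map and a neural critic. It is a minimal stand-in for RSMI-style neural estimators: the synthetic AR(1) data, the one-component coarse variable, and the critic architecture are assumptions for illustration, not taken from the cited implementations.

```python
import math
import torch
import torch.nn as nn

# Minimal sketch of the variational MI principle with an InfoNCE lower bound.
# Assumptions: synthetic AR(1) chain data stand in for spin configurations,
# the coarse-graining map Lambda is a single linear filter on the visible
# patch, and a small MLP critic scores (H, E) pairs.
torch.manual_seed(0)
L, r, n_samples = 16, 0.8, 4096
x = torch.zeros(n_samples, L)
noise = torch.randn(n_samples, L)
for i in range(1, L):                       # AR(1): exponentially decaying correlations
    x[:, i] = r * x[:, i - 1] + noise[:, i]

V, E = x[:, 0:4], x[:, 8:16]                # visible patch / distant environment (buffer 4..7 dropped)

coarse = nn.Linear(4, 1, bias=False)        # parametrized coarse-graining map Lambda
critic = nn.Sequential(nn.Linear(1 + 8, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(coarse.parameters()) + list(critic.parameters()), lr=1e-3)

def infonce_bound(h, e):
    # Score every (h_i, e_j) pair; the diagonal holds the jointly sampled pairs.
    K = h.shape[0]
    hh = h.unsqueeze(1).expand(K, K, 1)
    ee = e.unsqueeze(0).expand(K, K, 8)
    scores = critic(torch.cat([hh, ee], dim=-1)).squeeze(-1)   # (K, K)
    return (scores.diag() - torch.logsumexp(scores, dim=1)).mean() + math.log(K)

for step in range(500):
    idx = torch.randint(0, n_samples, (256,))
    bound = infonce_bound(coarse(V[idx]), E[idx])
    opt.zero_grad()
    (-bound).backward()                     # ascend the MI lower bound
    opt.step()

print("InfoNCE lower bound on I(H:E):", float(bound), "nats")
print("learned filter:", coarse.weight.data.numpy())
```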
This framework has concrete operational meaning: when $I_{\Lambda}(\mathcal{H}:\mathcal{E})$ saturates—in the sense that further enlargement of the retained set $\mathcal{H}$ does not increase the MI—all relevant degrees of freedom have been retained; additional variables will only encode local noise or redundant information (Koch-Janusz et al., 2017, Lenggenhager et al., 2018).
3. Mutual Information as a Diagnostic of Renormalizability
The value and structure of mutual information under RG directly reflect the theory's renormalizability. Several universal principles emerge:
- Saturation and Relevance: If $I_{\Lambda}(\mathcal{H}:\mathcal{E})$ reaches a plateau as the set of retained variables is enlarged, only a finite set of coarse variables encodes relevant, long-range information—i.e., the system is renormalizable (see the sketch following this list).
- Decline of Complexity: For models with Hamiltonians $H[\mathbf{x}]$ and, in the disordered case, distributions over couplings, maximization of RSMI prevents the generation of long-range or high-body couplings in the renormalized Hamiltonian. The exponential suppression of the "rangeness" and "$n$-bodyness" of the effective couplings quantifies this effect (Lenggenhager et al., 2018).
- Disorder Independence: In disordered systems, a "perfect" RSMI coarse-graining suppresses the appearance of new disorder correlations under the RG flow (Lenggenhager et al., 2018).
- Markovianity: When MI saturates at the coarse-graining scale, the effective Hamiltonian is strictly short-ranged: there are no direct couplings skipping over a block (Koch-Janusz et al., 2017).
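A minimal illustration of the saturation and Markovianity statements above (assuming an open-boundary nearest-neighbor Ising chain, small enough for exact enumeration): retaining the single block spin adjacent to the environment already captures all long-range information, and enlarging the retained set leaves $I(\mathcal{H}:\mathcal{E})$ unchanged.

```python
import numpy as np
from itertools import product

# Saturation check (assumptions): open 1D Ising chain, exact enumeration.
# The block is sites 0-3 and the environment is sites 4-7.  By the 1D Markov
# property, the block spin adjacent to the environment already carries all
# long-range information, so I(H:E) plateaus immediately.
N, J, beta = 8, 1.0, 0.6
block, env = [0, 1, 2, 3], [4, 5, 6, 7]

configs = [np.array(c) for c in product([-1, 1], repeat=N)]
energies = np.array([-J * (s[:-1] * s[1:]).sum() for s in configs])   # open boundary
p = np.exp(-beta * energies)
p /= p.sum()

def mi(retained):
    joint, ph, pe = {}, {}, {}
    for s, ps in zip(configs, p):
        h, e = tuple(s[retained]), tuple(s[env])
        joint[(h, e)] = joint.get((h, e), 0.0) + ps
        ph[h] = ph.get(h, 0.0) + ps
        pe[e] = pe.get(e, 0.0) + ps
    return sum(pj * np.log(pj / (ph[h] * pe[e])) for (h, e), pj in joint.items())

for k in range(1, 5):
    retained = block[-k:]     # keep the k block spins nearest to the environment
    print(f"retain {retained}: I(H:E) = {mi(retained):.6f} nats")
# All four values coincide: the MI plateaus after the single boundary spin.
```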
The MI spectrum associated with the Fisher-information second variation around the optimal coarse-graining map encodes the scaling dimensions of emergent operators: leading (largest-eigenvalue) directions correspond to relevant operators, with the decay of each eigenvalue as a function of the buffer size governed by the scaling dimension of the associated operator (Gökmen et al., 2021).
4. Momentum-Space Mutual Information and RG Classification
A distinct but closely related approach employs the mutual information between infinitesimal momentum shells as a universal diagnostic for renormalizability in quantum field theory—and crucially, in both equilibrium and out-of-equilibrium settings (Bowen et al., 12 Nov 2025). For two thin shells $A$ and $B$ centered at momenta $k$ and $k'$, define
$$ I(k,k') \;=\; S(\rho_{A}) + S(\rho_{B}) - S(\rho_{A\cup B}). $$
At large momentum separation, the asymptotic behavior of $I(k,k')$ tracks the engineering (mass) dimension $[\lambda]$ of the coupling, whose sign yields the conventional RG classification:
- $[\lambda] > 0$: super-renormalizable theory,
- $[\lambda] = 0$: renormalizable (marginal coupling),
- $[\lambda] < 0$: non-renormalizable.
This is a direct consequence of the engineering dimension of the coupling: asymptotically, the MI between widely separated shells scales as a power of the momentum ratio whose exponent is set by $[\lambda]$, so that the behavior of the large-separation tail encodes the same classification. This result holds both in Minkowski spacetime and for conformally coupled fields on de Sitter; in all cases, the large-separation tail of the MI encodes the relevant/irrelevant classification of couplings in a regulator-independent manner (Bowen et al., 12 Nov 2025). Because the MI is constructed out of reduced density matrices for disjoint momentum shells, boundary divergences and regularization subtleties are avoided, enhancing universality.
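As a toy analogue (not the field-theoretic construction of the cited paper), the sketch below computes the MI between two coupled bosonic modes from their Gaussian ground-state covariance matrices via symplectic eigenvalues; the coupling $g$ stands in for the residual interaction between widely separated shells, and the MI shrinks monotonically as that coupling weakens.

```python
import numpy as np

# Toy illustration (assumptions): two bosonic modes ("shells") coupled by
# g * x1 * x2 in their joint ground state.  MI is computed from Gaussian-state
# covariance matrices; the vacuum symplectic eigenvalue is 1/2 (hbar = m = 1).
def mode_entropy(nu):
    # von Neumann entropy of a Gaussian mode with symplectic eigenvalue nu >= 1/2.
    if nu <= 0.5 + 1e-12:
        return 0.0
    return (nu + 0.5) * np.log(nu + 0.5) - (nu - 0.5) * np.log(nu - 0.5)

def two_mode_mi(omega, g):
    # Normal-mode frequencies (requires g < omega^2 for stability).
    wp, wm = np.sqrt(omega**2 + g), np.sqrt(omega**2 - g)
    # Reduced covariance of one oscillator: <x^2>, <p^2> (no x-p correlations).
    xx = 0.5 * (1 / (2 * wp) + 1 / (2 * wm))
    pp = 0.5 * (wp / 2 + wm / 2)
    nu = np.sqrt(xx * pp)                   # symplectic eigenvalue of the reduced state
    # The global ground state is pure, so S(rho_AB) = 0 and I(A:B) = 2 * S(rho_A).
    return 2 * mode_entropy(nu)

omega = 1.0
for g in [0.9, 0.5, 0.1, 0.01, 0.001]:
    print(f"coupling g = {g:6.3f}  ->  I = {two_mode_mi(omega, g):.5f} nats")
```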
5. Regularized and Renormalized Mutual Information: Coarse-Grainability
In the case of deterministic feature extraction for high-dimensional continuous variables, the classical mutual information diverges. The concept of "renormalized mutual information" provides a finite, reparametrization-invariant quantifier (Sarra et al., 2020): for a scalar feature $y = f(\mathbf{x})$,
$$ \tilde I(\mathbf{x}, y) \;=\; H(y) \;-\; \int d\mathbf{x}\; P(\mathbf{x})\, \ln \big\lVert \nabla_{\mathbf{x}} f(\mathbf{x}) \big\rVert . $$
This measure quantifies the nontrivial, compressive content of the feature $y$ about the high-dimensional input $\mathbf{x}$, after removing the trivial (deterministic-map) contribution. A large $\tilde I$ signals that $y$ is a highly informative coarse-grained variable and, therefore, that the system is coarse-grainable—i.e., renormalizable in terms of this collective variable.
Practical maximization of $\tilde I$ (e.g., using neural networks) has been demonstrated to discover collective variables in both synthetic and physically motivated systems, with distinct peaks in $\tilde I$ indicating well-aligned emergent variables and the corresponding coarse-grainability. An important caveat is that $\tilde I$ is not symmetric in its arguments and can be negative when the feature fails to capture any information about $\mathbf{x}$ (Sarra et al., 2020).
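For linear features of a Gaussian input, both terms of the renormalized MI (as written above) are available in closed form, which makes its reparametrization invariance and its preference for well-aligned collective variables easy to check. The covariance matrix and candidate features in the sketch below are illustrative assumptions.

```python
import numpy as np

# Toy check (assumptions): renormalized MI as quoted above, evaluated in closed
# form for linear features y = w.x of a correlated 2D Gaussian.  For linear f,
# grad f = w everywhere and H(y) is a 1D Gaussian entropy, so
#   I_r = 0.5 * ln(2*pi*e * w.Sigma.w) - ln ||w||,
# which is invariant under rescaling w and largest for the feature aligned
# with the dominant collective direction.
Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])

def renormalized_mi_linear(w, Sigma):
    w = np.asarray(w, dtype=float)
    H_y = 0.5 * np.log(2 * np.pi * np.e * (w @ Sigma @ w))   # entropy of y = w.x
    grad_term = np.log(np.linalg.norm(w))                    # <ln ||grad f||> for linear f
    return H_y - grad_term

_, eigvecs = np.linalg.eigh(Sigma)
w_best, w_worst = eigvecs[:, -1], eigvecs[:, 0]              # dominant / weakest direction

print("aligned feature   :", renormalized_mi_linear(w_best, Sigma))
print("orthogonal feature:", renormalized_mi_linear(w_worst, Sigma))
print("rescaled (x1000)  :", renormalized_mi_linear(1000 * w_best, Sigma))  # unchanged
```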
6. Geometric and Entropic Measures: Entanglement and RG Monotones
Entanglement-based mutual information between spatial regions, particularly in quantum field theory, provides a geometric and universal regularization of long-known RG monotones ("c-functions," "F-theorems") (Swingle, 2010, Casini et al., 2015). For two adjacent or nearly coincident spatial regions $A$ and $B$, the mutual information
$$ I(A:B) \;=\; S(A) + S(B) - S(A\cup B) $$
exhibits universal short-distance singularities as the separation $\epsilon$ between the regions shrinks, and the coefficient of the universal term encodes entanglement per scale—directly related to central charges (the central charge $c$ in $d=2$, the F-coefficient in $d=3$). Under RG flows this coefficient decreases from UV to IR, providing a monotonic, finite, and cutoff-independent measure of renormalizability (Swingle, 2010, Casini et al., 2015). The regulator-independent F-coefficient defined from the mutual information of concentric circles is particularly robust for three-dimensional QFTs (Casini et al., 2015).
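A standard way to see this finite, geometric MI at work numerically (independent of the cited papers) is the free-fermion correlation-matrix method, sketched below for two intervals of a tight-binding chain; the chain length, interval positions, and sizes are arbitrary illustrative choices.

```python
import numpy as np

# Sketch (assumptions): free-fermion tight-binding chain, ground state at half
# filling, open boundaries.  Subregion entropies follow from the eigenvalues of
# the correlation matrix C_ij = <c_i^dag c_j> restricted to the region, and
# I(A:B) = S_A + S_B - S_AB is finite and decreases as the intervals separate,
# the behavior that the c-function constructions exploit.
L = 200
H = np.zeros((L, L))
for i in range(L - 1):
    H[i, i + 1] = H[i + 1, i] = -1.0           # nearest-neighbor hopping
eps, U = np.linalg.eigh(H)
occ = U[:, : L // 2]                            # fill the lowest L/2 modes
C = occ @ occ.T                                 # ground-state correlation matrix

def region_entropy(C, sites):
    nu = np.linalg.eigvalsh(C[np.ix_(sites, sites)])
    nu = np.clip(nu, 1e-12, 1 - 1e-12)
    return float(-(nu * np.log(nu) + (1 - nu) * np.log(1 - nu)).sum())

def mutual_info(C, A, B):
    return region_entropy(C, A) + region_entropy(C, B) - region_entropy(C, A + B)

ell = 20                                        # interval length
A = list(range(60, 60 + ell))
for sep in [1, 2, 5, 10, 20, 40]:
    B = list(range(60 + ell + sep, 60 + 2 * ell + sep))
    print(f"separation {sep:3d}:  I(A:B) = {mutual_info(C, A, B):.4f} nats")
```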
7. Statistical Inference, Quantum Distinguishability, and Information Loss
Operationally, the central insight unifying all these approaches is that RG flow is the process of information loss under finite-resolution probes (Bény et al., 2014, Bény et al., 2013). Distinguishability metrics (relative entropy, Fisher information) determine the “relevant” directions: if only finitely many observables preserve significant mutual information at large coarse-graining (e.g., n-point functions of low momentum modes under a noisy channel), the theory is renormalizable. This construction is independent of microscopic details and provides an information-theoretic, observer-centered interpretation of RG (Bény et al., 2014, Bény et al., 2013).
Quantitative operationalizations include:
- Distinguishability density: the relative-entropy (or Fisher-metric) distinguishability between the exact state and its coarse-grained image; a perturbation counts as relevant if its distinguishability does not vanish as the coarse-graining scale grows (Bény et al., 2014). A toy version appears in the sketch after this list.
- Relevance spectra: Eigenvalues of the "relevance" operator give the principal compression ratios after coarse-graining; only finitely many remain nonzero for renormalizable theories (Bény et al., 2013).
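A toy operationalization of this picture (assumptions: Gaussian field on a lattice, Gaussian smoothing-plus-noise channel, mean-shift perturbations) shows how the relative-entropy distinguishability of a long-wavelength perturbation survives coarse-graining while an equally large short-wavelength one is erased.

```python
import numpy as np

# Toy sketch (assumptions): distinguishability = relative entropy between an
# unperturbed Gaussian field N(0, I) on L sites and a perturbed one N(mu, I).
# The coarse-graining channel smooths with a Gaussian kernel and adds noise.
# A long-wavelength ("relevant") perturbation retains most of its
# distinguishability; a short-wavelength one of equal size is washed out.
L, sigma_kernel, sigma_noise = 64, 2.0, 0.5
i = np.arange(L)
K = np.exp(-0.5 * ((i[:, None] - i[None, :]) / sigma_kernel) ** 2)
K /= K.sum(axis=1, keepdims=True)                      # smoothing (coarse-graining) kernel

def kl_after_channel(mu):
    # Channel: x -> K x + noise.  Output covariance is the same for both states,
    # so KL(N(K mu, S) || N(0, S)) = 0.5 * (K mu)^T S^{-1} (K mu).
    S = K @ K.T + sigma_noise**2 * np.eye(L)
    v = K @ mu
    return 0.5 * v @ np.linalg.solve(S, v)

mu_slow = np.sin(2 * np.pi * i / L)                    # long-wavelength perturbation
mu_fast = np.sin(2 * np.pi * i * 16 / L)               # short-wavelength perturbation
mu_fast *= np.linalg.norm(mu_slow) / np.linalg.norm(mu_fast)   # equal microscopic size

print("KL before channel (both):", 0.5 * np.linalg.norm(mu_slow) ** 2)
print("KL after, slow mode :", kl_after_channel(mu_slow))
print("KL after, fast mode :", kl_after_channel(mu_fast))
```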
References Table
| Aspect | Key Reference(s) | MI Formalism / Role |
|---|---|---|
| Variational RSMI principle | (Koch-Janusz et al., 2017, Lenggenhager et al., 2018, Gökmen et al., 2021) | $\max_{\Lambda} I_{\Lambda}(\mathcal{H}:\mathcal{E})$ over coarse-graining maps |
| Momentum-space MI / classification | (Bowen et al., 12 Nov 2025) | $I(k,k')$ between momentum shells as RG diagnostic |
| Renormalized MI for coarse-grainability | (Sarra et al., 2020) | $\tilde I(\mathbf{x}, y)$ for deterministic features |
| Entanglement MI & geometric RG monotones | (Swingle, 2010, Casini et al., 2015) | $I(A:B)$ between spatial regions as c-function |
| Information loss & inference | (Bény et al., 2014, Bény et al., 2013) | Fisher metric, relevance spectra |
Summary
Mutual information–based measures of renormalizability supply a nonperturbative, representation-agnostic framework for RG across a broad spectrum of systems. By recasting renormalization as the preservation (or loss) of information—quantified via MI—these approaches unify statistical, geometric, and operational RG notions, yield practical optimization schemes, and connect abstract field-theoretic anomalies directly to information-theoretic quantities. Their power and generality are demonstrated via neural-network implementations, momentum-space diagnostics, and universal c-functions in quantum field theory. Limitations may arise from the need for effective MI estimators at large system sizes, accurate modeling of the underlying probability distributions, and the extension to quantum or out-of-equilibrium settings, where path-integral sampling or time-dependent propagators are essential.
A plausible implication is that further refinements in MI-based estimators and their deployment in automated discovery pipelines may enable systematic theory-building in both well-understood and novel physical and stochastic systems.