Sandwiched Rényi Divergences

Updated 6 September 2025

Sandwiched Rényi divergence is a quantum generalization of classical Rényi divergence that rigorously quantifies the distinguishability between density matrices.
It satisfies key properties such as positivity, data processing inequality, and monotonicity in α, enabling its use in quantum hypothesis testing and determining channel capacities.
Advanced techniques like Hölder’s inequality, Riesz–Thorin interpolation, and Sion’s minimax theorem underpin its formulation and extension to non-commutative and infinite-dimensional settings.

The sandwiched Rényi divergence is a non-commutative generalization of the classical Rényi divergence, designed to capture distinguishability between quantum states (density matrices) in ways that are both mathematically rigorous and operationally meaningful for quantum information theory. Independently introduced by Wilde et al. and Müller‐Lennert et al. in 2013, the sandwiched Rényi divergence unifies various quantum entropic quantities—including the min-, max-, and quantum relative entropies—and, crucially, satisfies data processing inequalities essential for analyses of quantum channel capacities, strong converse exponents, and operational tasks in hypothesis testing.

1. Formal Definition and Core Properties

Given quantum states (density operators) $\rho$ and $\sigma$ on a finite-dimensional Hilbert space, the sandwiched Rényi divergence of order $\alpha>0$ (with $\alpha \ne 1$) is defined as:

$D_{\alpha}(\rho\|\sigma) = \frac{1}{\alpha - 1} \log \operatorname{tr}\left[ \left( \sigma^{\frac{1-\alpha}{2\alpha}} \, \rho \, \sigma^{\frac{1-\alpha}{2\alpha}} \right)^\alpha \right],$

when $\operatorname{supp}(\rho) \subseteq \operatorname{supp}(\sigma)$. For $\alpha>1$, the divergence is set to $+\infty$ if this support condition fails.
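The definition above can be evaluated numerically through eigendecompositions. The following is a minimal sketch, assuming NumPy and natural logarithms; the helper names `psd_power` and `sandwiched_renyi` and the tolerance `tol` are illustrative choices, not taken from the cited papers.

```python
import numpy as np

def psd_power(a, p, tol=1e-12):
    """Raise a positive semidefinite matrix to the power p on its support."""
    w, v = np.linalg.eigh(a)
    powered = np.zeros_like(w)
    keep = w > tol                      # eigenvalues treated as nonzero
    powered[keep] = w[keep] ** p
    return (v * powered) @ v.conj().T

def sandwiched_renyi(rho, sigma, alpha, tol=1e-12):
    """D_alpha(rho || sigma) in nats for alpha > 0, alpha != 1."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    # Support condition: rho must vanish on the kernel of sigma.
    w, v = np.linalg.eigh(sigma)
    kernel = v[:, w <= tol]
    if alpha > 1 and kernel.shape[1] > 0:
        leak = np.linalg.norm(kernel.conj().T @ rho @ kernel)
        if leak > 1e-9:
            return float("inf")
    s = psd_power(sigma, (1 - alpha) / (2 * alpha), tol)
    inner = s @ rho @ s
    inner = (inner + inner.conj().T) / 2    # remove rounding asymmetry
    ev = np.linalg.eigvalsh(inner)
    ev = ev[ev > tol]
    return float(np.log(np.sum(ev ** alpha)) / (alpha - 1))

rho = np.diag([0.75, 0.25])
sigma = np.eye(2) / 2
print(sandwiched_renyi(rho, sigma, 2.0))    # equals log(1.25) for this commuting pair
```

Taking powers of $\sigma$ only on its support is a design choice of this sketch: it keeps the negative exponent that arises for $\alpha>1$ well defined once the support condition has been checked.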

Key properties established in (Beigi, 2013):

Positivity and Faithfulness: $D_\alpha(\rho\|\sigma) \ge 0$ with equality if and only if $\rho = \sigma$, for all $\alpha > 0$, $\alpha \neq 1$.
Data Processing Inequality (DPI): For any completely positive trace-preserving (CPTP) map $\Phi$, $D_{\alpha}(\rho\|\sigma) \ge D_{\alpha}(\Phi(\rho)\|\Phi(\sigma))$ holds for all $\alpha > 1$ (Beigi, 2013, Theorem 6), and more generally for $\alpha \geq 1/2$ via alternative proofs.
Monotonicity in $\alpha$: $D_\alpha(\rho\|\sigma)$ is monotonically non-decreasing in $\alpha$ for $\alpha > 1$ (Beigi, 2013, Theorem 7); it is constant, for example, when $\rho = \sigma$.
Reduction to Classical Case: If $\rho$ and $\sigma$ commute, $D_\alpha(\rho\|\sigma)$ collapses to the classical Rényi divergence.
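The listed properties can be spot-checked numerically. The sketch below assumes NumPy, full-rank states, and natural logarithms; `mpow`, `d_alpha`, and `random_state` are hypothetical helpers written for this illustration, and a check on one random pair is a sanity test, not a proof.

```python
import numpy as np

def mpow(a, p):
    """Power of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w ** p) @ v.conj().T

def d_alpha(rho, sigma, alpha):
    """Sandwiched Renyi divergence in nats, assuming full-rank states."""
    s = mpow(sigma, (1 - alpha) / (2 * alpha))
    ev = np.clip(np.linalg.eigvalsh(s @ rho @ s), 0.0, None)
    return float(np.log(np.sum(ev ** alpha)) / (alpha - 1))

def random_state(d, rng):
    """Random full-rank density matrix (hypothetical helper for the demo)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

rng = np.random.default_rng(0)
rho, sigma = random_state(3, rng), random_state(3, rng)
alphas = [1.1, 1.5, 2.0, 3.0, 5.0]
values = [d_alpha(rho, sigma, a) for a in alphas]

# Positivity and monotonicity in alpha on a grid with alpha > 1.
nonnegative = all(v >= -1e-10 for v in values)
nondecreasing = all(x <= y + 1e-10 for x, y in zip(values, values[1:]))

# Commuting (diagonal) states: compare with the classical Renyi divergence.
p = np.array([0.6, 0.3, 0.1])
q = np.array([0.2, 0.5, 0.3])
classical = [np.log(np.sum(p ** a * q ** (1 - a))) / (a - 1) for a in alphas]
quantum = [d_alpha(np.diag(p), np.diag(q), a) for a in alphas]
commuting_matches = np.allclose(classical, quantum)

print(nonnegative, nondecreasing, commuting_matches)
```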

Additionally, in the limit $\alpha \rightarrow 1$, the sandwiched Rényi divergence specializes to the quantum relative entropy (also known as Umegaki's relative entropy). For $\alpha=1/2$ and $\alpha\to\infty$, it reduces to the min- and max-relative entropies, respectively (Datta et al., 2013).
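Written out, these special cases take the following standard forms. This is a sketch obtained by substituting the stated values of $\alpha$ into the definition; the fidelity convention $F(\rho,\sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$ is an assumption of this illustration, and the symbol $D_{\max}$ denotes the max-relative entropy:

$\lim_{\alpha\to 1} D_\alpha(\rho\|\sigma) = \operatorname{tr}\left[\rho\,(\log\rho - \log\sigma)\right],$

$D_{1/2}(\rho\|\sigma) = -2\log\left\|\sqrt{\rho}\sqrt{\sigma}\right\|_1 = -\log F(\rho,\sigma),$

$\lim_{\alpha\to\infty} D_\alpha(\rho\|\sigma) = \log\left\|\sigma^{-1/2}\,\rho\,\sigma^{-1/2}\right\|_\infty = D_{\max}(\rho\|\sigma).$

For $\alpha = 1/2$ the exponent $\frac{1-\alpha}{2\alpha}$ equals $1/2$ and the prefactor $\frac{1}{\alpha-1}$ equals $-2$, which gives the second line directly.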

2. Mathematical Foundations: Interpolation, Hölder, and Minimax

The proofs of the principal properties rely on a combination of advanced mathematical techniques:

Hölder's Inequality: Positivity, together with its strict form (equality only for $\rho = \sigma$), follows from applying generalized Hölder inequalities for Schatten norms.
Riesz–Thorin Interpolation: The data processing inequality hinges on establishing contractivity properties via complex interpolation theory, interpolating between the endpoints of the non-commutative $L_p$-spaces (Beigi, 2013, Theorem 4).
Sion's Minimax Theorem: The minimax structure underlying the duality properties of quantum conditional Rényi entropy is resolved using Sion's minimax theorem, leveraging convexity–concavity of operator expressions.

These techniques also facilitate extensions of the main results to von Neumann algebras and infinite-dimensional settings via weighted non-commutative $L_p$-norms (Berta et al., 2016, Jencova, 2016).

3. Data Processing and Equality Conditions

For $\alpha\geq 1/2$, the sandwiched Rényi divergence is monotonic under all CPTP quantum operations. Equality in the DPI is characterized by an algebraic condition: for a quantum operation $\Lambda$ with adjoint $\Lambda^{\dagger}$, it reads

$\sigma^{\gamma}(\sigma^{\gamma}\rho\sigma^{\gamma})^{\alpha-1}\sigma^{\gamma} = \Lambda^{\dagger}\left(\Lambda(\sigma)^{\gamma}(\Lambda(\sigma)^{\gamma}\Lambda(\rho)\Lambda(\sigma)^{\gamma})^{\alpha-1}\Lambda(\sigma)^{\gamma}\right), \;\; \gamma = \frac{1-\alpha}{2\alpha}$

(Leditzky et al., 2016, Theorem 2.1). For the partial trace, this reduces to an invariance property involving the marginals and their extensions.
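A simple case in which the equality condition can be verified by hand is the partial trace $\Lambda = \operatorname{tr}_B$ acting on product states $\rho_A\otimes\tau_B$ and $\sigma_A\otimes\tau_B$ with a common full-rank factor $\tau_B$: the powers of $\tau_B$ cancel, and $\Lambda^{\dagger}(X) = X\otimes I_B$. The sketch below checks this numerically, assuming NumPy and full-rank random states; all helper names are hypothetical and chosen for this illustration.

```python
import numpy as np

def mpow(a, p):
    """Power of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w ** p) @ v.conj().T

def random_state(d, rng):
    """Random full-rank density matrix (hypothetical helper for the demo)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def d_alpha(rho, sigma, alpha):
    """Sandwiched Renyi divergence in nats, assuming full-rank states."""
    s = mpow(sigma, (1 - alpha) / (2 * alpha))
    ev = np.clip(np.linalg.eigvalsh(s @ rho @ s), 0.0, None)
    return float(np.log(np.sum(ev ** alpha)) / (alpha - 1))

def equality_operator(rho, sigma, alpha):
    """sigma^g (sigma^g rho sigma^g)^(alpha-1) sigma^g with g = (1-alpha)/(2 alpha)."""
    g = (1 - alpha) / (2 * alpha)
    s = mpow(sigma, g)
    inner = s @ rho @ s
    inner = (inner + inner.conj().T) / 2    # remove rounding asymmetry
    return s @ mpow(inner, alpha - 1) @ s

rng = np.random.default_rng(1)
dA, dB, alpha = 2, 3, 1.7
rho_A, sigma_A, tau_B = (random_state(d, rng) for d in (dA, dA, dB))
rho_AB = np.kron(rho_A, tau_B)
sigma_AB = np.kron(sigma_A, tau_B)

# Left side: operator built from the global states.
lhs = equality_operator(rho_AB, sigma_AB, alpha)
# Right side: adjoint of the partial trace (X -> X tensor identity) applied to
# the operator built from the marginals tr_B(rho_AB) = rho_A, tr_B(sigma_AB) = sigma_A.
rhs = np.kron(equality_operator(rho_A, sigma_A, alpha), np.eye(dB))

condition_holds = np.allclose(lhs, rhs)
divergences_equal = np.isclose(d_alpha(rho_AB, sigma_AB, alpha),
                               d_alpha(rho_A, sigma_A, alpha))
print(condition_holds, divergences_equal)
```

In this example discarding $B$ loses no information about which of the two states is present, so the DPI is saturated and the algebraic condition holds at the same time.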

Applications derived from this equality condition include:

Rényi Version of the Araki–Lieb Inequality: Providing sharp statements for conditional entropies and entanglement measures.
Characterization of Recovery Maps: If DPI is saturated for $D_\alpha$, the original state can be recovered via the Petz recovery map (Wang et al., 2020).

4. Operational Applications in Quantum Information Theory

Hypothesis Testing and Strong Converse Exponents

Sandwiched Rényi divergence governs the tradeoff between type I and type II errors in quantum hypothesis testing in the strong converse regime (Hiai et al., 2021). For state discrimination between $\rho$ and $\sigma$, the strong converse exponent is given by the Legendre–Fenchel transform involving $D_\alpha(\rho\|\sigma)$ for $\alpha>1$ (Mosonyi, 2021, Li et al., 2022).
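In the form commonly stated in the literature, the transform reads as follows. This is given as an illustration and is not quoted from the works cited above; the symbols $sc(r)$ for the strong converse exponent and $r$ for the required type II error exponent are notation assumed here:

$sc(r) = \sup_{\alpha>1} \frac{\alpha-1}{\alpha}\left[\, r - D_\alpha(\rho\|\sigma) \,\right].$

Since $D_\alpha(\rho\|\sigma)$ is non-decreasing in $\alpha$ and tends to the quantum relative entropy $D(\rho\|\sigma)$ as $\alpha\to 1$, every term in the supremum is non-positive when $r \le D(\rho\|\sigma)$, so $sc(r)=0$ there; the exponent becomes strictly positive once $r > D(\rho\|\sigma)$.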

Quantum Channel Capacities and Holevo Information

Strong Converse for Channel Capacity: Applying $D_\alpha$ for $\alpha>1$, one derives strong converse bounds for channel capacities (e.g., for entanglement-breaking channels) (Beigi, 2013).
α-Holevo Information: Defined via $I_\alpha(A:B)=\min_{\sigma_B} D_\alpha(\rho_{AB}\|\rho_A\otimes\sigma_B)$; the corresponding channel quantity is shown to be super-additive, so $I_\alpha(\Phi\otimes\Psi)\geq I_\alpha(\Phi)+I_\alpha(\Psi)$ for channels $\Phi$ and $\Psi$.

Dynamical Convergence and Mixing Times

In quantum Markov semigroups and primitive quantum evolutions, the decay of the sandwiched Rényi divergence provides sharp exponential mixing bounds:

$D_p(e^{tL}(\rho)\|\sigma) \leq e^{-2\beta_p(L)t} D_p(\rho\|\sigma),$

where $\beta_p(L)$ is an entropic convergence constant related to the generator's logarithmic Sobolev constant (Müller-Hermes et al., 2016).
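The qualitative content of this bound, decay of the divergence toward the fixed point, can be observed on a toy example. The sketch below assumes NumPy and uses the generalized depolarizing generator $L(\rho) = \operatorname{tr}(\rho)\,\sigma - \rho$, chosen here purely for illustration, for which $e^{tL}(\rho) = e^{-t}\rho + (1-e^{-t})\sigma$. It does not compute $\beta_p(L)$; all helper names are hypothetical.

```python
import numpy as np

def mpow(a, p):
    """Power of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w ** p) @ v.conj().T

def d_alpha(rho, sigma, alpha):
    """Sandwiched Renyi divergence in nats, assuming full-rank states."""
    s = mpow(sigma, (1 - alpha) / (2 * alpha))
    ev = np.clip(np.linalg.eigvalsh(s @ rho @ s), 0.0, None)
    return float(np.log(np.sum(ev ** alpha)) / (alpha - 1))

def random_state(d, rng):
    """Random full-rank density matrix (hypothetical helper for the demo)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    m = g @ g.conj().T
    return m / np.trace(m).real

def evolve(rho, sigma, t):
    """e^{tL}(rho) for the assumed generator L(rho) = tr(rho) sigma - rho."""
    return np.exp(-t) * rho + (1 - np.exp(-t)) * sigma

rng = np.random.default_rng(2)
rho, sigma = random_state(3, rng), random_state(3, rng)
times = np.linspace(0.0, 5.0, 11)
values = [d_alpha(evolve(rho, sigma, t), sigma, 2.0) for t in times]

for t, v in zip(times, values):
    print(f"t = {t:.1f}   D_2 = {v:.6f}")
```

Because each map $e^{tL}$ in this example is CPTP and leaves $\sigma$ invariant, the data processing inequality already guarantees that the printed values do not increase with $t$; the cited bound additionally quantifies the rate.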

Learning Theory and Generalization Error

In quantum learning, upper bounds on the generalization error have been obtained using the sandwiched Rényi divergence and its modified forms, giving bounds tighter than their classical counterparts and than those based on the Petz divergence (Warsi et al., 16 May 2025).

5. Generalizations: Non-commutative and Infinite-dimensional Settings

Formulations based on Araki–Masuda and Kosaki's non-commutative $L_p$-spaces allow extension of sandwiched Rényi divergences to arbitrary von Neumann algebras (Berta et al., 2016, Jencova, 2016, Jenčová, 2017), ensuring data processing and continuity properties are preserved even outside the finite-dimensional setting.

The divergence admits a variational representation and, via finite-dimensional approximation, retains its operational significance (e.g., for strong converse exponents and measured Rényi divergences) in infinite-dimensional systems (Hiai et al., 2021).

6. Continuity, Stability, and Interpolation Theory

Recent works provide uniform continuity bounds for sandwiched Rényi quantities (divergence, conditional entropy, mutual information) using approaches based on (almost) additivity, operator space norms, and complex interpolation theory (Bluhm et al., 2023). These continuity results are essential for resource theories, stability of approximate quantum Markov chains, and robustness analysis.

In the convex interpolation approach, explicit norm formulas (such as $\|X\|_{(\mathcal{C}, p', q')}$) enable sharp bounds even in high dimension and for perturbations in the underlying quantum states.

7. Open Directions and Further Implications

Although sandwiched Rényi divergence and its extensions now serve as central tools for quantum information theory, ongoing research addresses:

Extensions to broader classes of quantum operations (beyond CPTP maps).
Tighter continuity results in infinite-dimensional settings (Bluhm et al., 2023).
Further refinements of strong converse rates in composite hypothesis testing (Burri, 5 Jun 2024).
Analytical and operational understanding of the behavior for $\alpha<1$, e.g., connection to privacy amplification, decoupling, and smoothing protocols (Li et al., 2022).

Summary Table: Distinguishing Features

Property/Context	Sandwiched Rényi Divergence ($D_\alpha$)	Reference
Data processing	Holds for $\alpha \geq 1/2$ (proved for $\alpha > 1$ in the cited work)	(Beigi, 2013)
Monotonicity in $\alpha$	Non-decreasing for $\alpha>1$	(Beigi, 2013)
Additivity (α-Holevo)	Super-additive	(Beigi, 2013)
Strong converse rate	Characterizes optimal exponents in hypothesis testing	(Hiai et al., 2021)
Extension to von Neumann	Yes, via non-commutative $L_p$ -spaces	(Berta et al., 2016)
Variational formula	Exists and central to operational analysis	(Jencova, 2016)
Operational meaning	State discrimination, channel capacity, learning error	(Mosonyi, 2021, Warsi et al., 16 May 2025)

The sandwiched Rényi divergence thus provides a unifying framework for quantum distinguishability, generalizes classical concepts, and is a fundamental ingredient in the quantitative analysis of quantum information processing.