Bernoulli f-Divergence Inequality

Updated 24 January 2026
  • Bernoulli f-Divergence Inequality is a framework defining sharp, explicit bounds linking f-divergences to total variation in Bernoulli distributions through convex generating functions.
  • Its methodology leverages reduction to two-point supports and precise extremal conditions, thereby generalizing classical results like Pinsker’s inequality to quantum contexts.
  • The inequality underpins applications in statistical decision making and information theory, offering actionable insights for hypothesis testing, risk minimization, and quantum divergence analysis.

The Bernoulli $f$-divergence inequality provides sharp, explicit relations between various $f$-divergences (of the Csiszár type) for Bernoulli distributions, frequently parameterized in terms of the total variation distance. These inequalities subsume and generalize classical results such as Pinsker’s inequality, and serve as a building block for both classical and quantum information-theoretic bounds. The foundational results revolve around convexity properties of the generating function $f$ and leverage reduction arguments to two-point supports.

1. Definition and Principal Formulation

Let $f:(0,\infty)\to\mathbb{R}$ be convex with $f(1)=0$. For probability measures $P\ll Q$, the $f$-divergence is defined by

$D_f(P\|Q) = \int_{q>0} f\left(\frac{p}{q}\right) dQ + f'(\infty) P\{ q=0 \}$

where $p=dP/d\lambda$, $q=dQ/d\lambda$ under any dominating measure $\lambda$. For Bernoulli distributions $P=\text{Bern}(p)$, $Q=\text{Bern}(q)$,

$D_f(\text{Bern}(p)\|\text{Bern}(q)) = q\,f\left(\frac{p}{q}\right) + (1-q)\,f\left(\frac{1-p}{1-q}\right)$

(Guntuboyina et al., 2013, 0903.1765, Bongole et al., 17 Jan 2026, Lanier et al., 24 Jan 2025).
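The two-point formula above is simple enough to check numerically. The following minimal sketch (function and variable names such as `bernoulli_f_div` are illustrative, not from the cited papers) evaluates the Bernoulli $f$-divergence for a few standard generators and verifies it against the direct closed forms:

```python
import math

def bernoulli_f_div(p, q, f, f_prime_inf=float("inf")):
    """D_f(Bern(p) || Bern(q)) for convex f with f(1) = 0.

    Boundary atoms with q-mass zero use the f'(inf) convention."""
    total = 0.0
    for pi, qi in ((p, q), (1 - p, 1 - q)):
        if qi > 0:
            total += qi * f(pi / qi)
        elif pi > 0:
            total += pi * f_prime_inf  # f'(inf) * P{q = 0}
    return total

kl   = lambda t: t * math.log(t)                 # KL generator
chi2 = lambda t: (t - 1) ** 2                    # chi-squared generator
hell = lambda t: 0.5 * (math.sqrt(t) - 1) ** 2   # squared-Hellinger generator

p, q = 0.6, 0.4
# KL agrees with the direct Bernoulli formula
direct_kl = p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))
assert abs(bernoulli_f_div(p, q, kl) - direct_kl) < 1e-12
# chi^2 agrees with (p - q)^2 / (q (1 - q))
assert abs(bernoulli_f_div(p, q, chi2) - (p - q) ** 2 / (q * (1 - q))) < 1e-12
# every f-divergence vanishes at p = q, since f(1) = 0
assert bernoulli_f_div(0.3, 0.3, hell) == 0.0
```

The `f_prime_inf` argument matters only for the singular part $f'(\infty)\,P\{q=0\}$ of the general definition; for interior $p,q\in(0,1)$ it is never used.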

2. Sharp Lower Bounds via Total Variation

The central inequalities relate $D_f(\text{Bern}(p)\|\text{Bern}(q))$ to the total variation distance $\delta=|p-q|$:

  • Bröcker’s monotonic lower bound (0903.1765):

$D_f(\text{Bern}(p)\|\text{Bern}(q)) \geq f(1+\delta/2) + f(1-\delta/2)$

This is tight for Bernoulli variables. The bounding function is strictly increasing in $\delta$ under mild regularity assumptions.

  • Sharp minimization via support reduction (Guntuboyina et al., 2013): For the minimum $D_f$ at fixed $|p-q|=V$,

$D_f(\text{Bern}(p)\|\text{Bern}(q)) \geq (1-V)\, f\left(\frac{1+V}{1-V}\right)$

attained when $p=(1+V)/2$, $q=(1-V)/2$, i.e., at symmetric pairs.
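Bröcker's bound lends itself to a grid check. The sketch below (helper names are illustrative) instantiates it for the KL generator $f(t)=t\log t$ and also verifies the claimed monotonicity of the bounding function in $\delta$:

```python
import math

def kl_bern(p, q):
    """KL divergence between Bern(p) and Bern(q), natural log."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def brocker_bound(delta, f):
    """Lower bound f(1 + delta/2) + f(1 - delta/2) from Broecker's lemma."""
    return f(1 + delta / 2) + f(1 - delta / 2)

f_kl = lambda t: t * math.log(t)

# D_f >= Broecker bound across a grid of Bernoulli pairs (KL case)
grid = [i / 20 for i in range(1, 20)]  # p, q in {0.05, ..., 0.95}
for p in grid:
    for q in grid:
        delta = abs(p - q)
        assert kl_bern(p, q) >= brocker_bound(delta, f_kl) - 1e-12

# the bounding function is increasing in delta (checked on a fine grid)
vals = [brocker_bound(d / 100, f_kl) for d in range(101)]
assert all(a <= b for a, b in zip(vals, vals[1:]))
```

For KL this check is also consistent with Pinsker's inequality: $(1+x)\ln(1+x)+(1-x)\ln(1-x)\approx x^2$ for $x=\delta/2$, which sits well below the Pinsker bound $2\delta^2$.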

3. Best-Possible Generalized Pinsker Inequalities

The framework in (0906.1244) gives integral representations and tight “Pinsker-type” lower bounds for arbitrary $f$ in terms of total variation:

$D_f(\text{Bern}(p)\|\text{Bern}(q)) \geq \Psi_f(\delta) := 2\left[\bar\Gamma_f\left(\frac{1}{2}-\frac{\delta}{2}\right) + \frac{\delta}{2}\, \Gamma_f\left(\frac{1}{2}\right) - \bar\Gamma_f\left(\frac{1}{2}\right) \right]$

where $\Gamma_f(\pi) = \int_0^\pi \gamma_f(t)\,dt$, $\bar\Gamma_f(\pi) = \int_0^\pi \Gamma_f(t)\,dt$, and $\gamma_f(\pi) = \frac{1}{\pi^3}\, f''\!\left(\frac{1-\pi}{\pi}\right)$ for twice-differentiable $f$.

The minimizing, or extremal, Bernoulli pairs for fixed $\delta$ are $p=(1+\delta)/2$, $q=(1-\delta)/2$.

4. Explicit Algebraic and Sandwich Inequalities

The “binary $f$-divergence inequality” (Lanier et al., 24 Jan 2025, Sason, 2015) provides sharp algebraic sandwich bounds between any two Bernoulli $f$-divergences, with formulas involving likelihood ratios and the $\chi^2$ divergence:

$m\, D_f(P\|Q) \leq p\,f\left(\frac{p}{q}\right) + (1-p)\,f\left(\frac{1-p}{1-q}\right) - f\left(1 + \frac{(p-q)^2}{q(1-q)}\right) \leq M\, D_f(P\|Q)$

where

$m = \min\left\{ \frac{p}{q}, \frac{1-p}{1-q} \right\},\quad M = \max\left\{ \frac{p}{q}, \frac{1-p}{1-q} \right\}$

and the total variation and $\chi^2$ divergence are

$\delta = |p-q|,\qquad \chi^2(P,Q) = \frac{\delta^2}{q(1-q)}$

This inequality gives explicit control of the $f$-divergence in terms of basic symmetric functions of $p$ and $q$ (Sason, 2015).
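A small grid check of the sandwich bound, again instantiated for the KL generator (a sketch under that assumption, not the authors' code; note that $1+\chi^2 = p^2/q + (1-p)^2/(1-q)$ for Bernoulli pairs):

```python
import math

f = lambda t: t * math.log(t)  # KL generator as the test case

def D_f(p, q):
    """Bernoulli f-divergence D_f(Bern(p) || Bern(q))."""
    return q * f(p / q) + (1 - q) * f((1 - p) / (1 - q))

def middle(p, q):
    """Middle expression of the sandwich bound."""
    chi2 = (p - q) ** 2 / (q * (1 - q))
    return p * f(p / q) + (1 - p) * f((1 - p) / (1 - q)) - f(1 + chi2)

# m * D_f <= middle <= M * D_f across a grid of Bernoulli pairs
for i in range(1, 20):
    for j in range(1, 20):
        if i == j:
            continue
        p, q = i / 20, j / 20
        m = min(p / q, (1 - p) / (1 - q))
        M = max(p / q, (1 - p) / (1 - q))
        d = D_f(p, q)
        assert m * d - 1e-9 <= middle(p, q) <= M * d + 1e-9
```

At $p=q$ both sides degenerate to $0$, so the grid skips the diagonal.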

5. Optimality, Tightness, and Equality Conditions

The reductions above are maximally tight for Bernoulli laws. Tightness follows from the fact that the relevant functions (Bayes-risk curve, data processing contractions, etc.) achieve their extrema for binary distributions. Equality is attained precisely when $dP/dQ$ takes only two values and $f$ is affine over the critical support points involved in the inequalities.

Cases of equality in the sandwich bound occur only in degenerate cases (i.e., $p=q$ or $f$ affine) or for the aforementioned symmetric extremal pairs.

6. Instantiations and Special Cases

The Bernoulli $f$-divergence inequalities specialize to classical divergences:

  • $f(t)=t\log t$ (KL): $D_f = p\log\frac{p}{q} + (1-p)\log\frac{1-p}{1-q}$; lower-bound example: $(1-V)\ln\left(\frac{1+V}{1-V}\right)$
  • $f(t)=(\sqrt{t}-1)^2/2$ (squared Hellinger): $D_f = \frac{q}{2}\left(\sqrt{p/q}-1\right)^2 + \frac{1-q}{2}\left(\sqrt{\frac{1-p}{1-q}}-1\right)^2$; lower-bound example: $1-\sqrt{1-V^2}$
  • $f(t)=(t-1)^2$ ($\chi^2$): $D_f = q\left(\left(\frac{p}{q}\right)^2-1\right) + (1-q)\left(\left(\frac{1-p}{1-q}\right)^2-1\right)$; lower-bound example: $\frac{V^2}{q(1-q)}$

All these bounds encode sharp relationships attained at the extremal Bernoulli pairs (Guntuboyina et al., 2013, Lanier et al., 24 Jan 2025, 0903.1765, 0906.1244).
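At the symmetric extremal pairs $p=(1+V)/2$, $q=(1-V)/2$, the squared-Hellinger and $\chi^2$ entries reduce to closed forms that can be verified directly (a minimal sketch; names are illustrative):

```python
import math

def bern_div(p, q, f):
    """Bernoulli f-divergence for interior p, q."""
    return q * f(p / q) + (1 - q) * f((1 - p) / (1 - q))

hell = lambda t: 0.5 * (math.sqrt(t) - 1) ** 2  # squared-Hellinger generator
chi2 = lambda t: (t - 1) ** 2                   # chi-squared generator

for V in (0.1, 0.3, 0.5, 0.8):
    p, q = (1 + V) / 2, (1 - V) / 2
    # squared Hellinger at the symmetric pair equals 1 - sqrt(1 - V^2),
    # since sqrt(pq) + sqrt((1-p)(1-q)) = sqrt(1 - V^2) there
    assert abs(bern_div(p, q, hell) - (1 - math.sqrt(1 - V * V))) < 1e-12
    # chi^2 for Bernoulli is exactly V^2 / (q (1 - q)) -- an identity
    assert abs(bern_div(p, q, chi2) - V * V / (q * (1 - q))) < 1e-12
```

The $\chi^2$ entry is in fact an identity for any Bernoulli pair with $|p-q|=V$, not only the symmetric one.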

7. Applications and Extensions

The Bernoulli $f$-divergence inequality underpins several advanced methods:

  • Interactive statistical decision making: Reduction and inversion arguments yield two-sided intervals for monotone transforms of risk (e.g., prior-predictive CVaR and quantile lower bounds) (Bongole et al., 17 Jan 2026).
  • Transfer to quantum divergences: The inequalities lift directly to quantum settings by reduction to classical analogues on two-point supports, sidestepping complex matrix analysis (Lanier et al., 24 Jan 2025).
  • Information-theoretic converse bounds: Generalization of Fano’s inequality and derivation of tight explicit bounds for loss probabilities, exponential moments, and tail risks.

The Bernoulli $f$-divergence inequality is thus a foundational tool for optimally relating statistical divergences under minimal informativeness constraints, with broad implications for hypothesis testing, risk minimization, and quantum information theory.
