Alpha-Beta Mean Distortion

Updated 9 August 2025

Alpha-Beta Mean Distortion is a parameterized family of discrepancy measures that interpolates between classical divergences like Kullback–Leibler, Itakura–Saito, and Hellinger distances.
It extends to complex vectors and positive definite operators, offering robust centroids that adapt to amplitude and phase variations in signal processing and statistical estimation.
Optimization of the mean distortion underpins applications in model selection, rate-distortion analysis, and nonparametric estimation while balancing bias and error.

Alpha-Beta Mean Distortion is a parameterized family of distortion measures originating from alpha-beta divergences, designed to interpolate and generalize several classical discrepancies (including Kullback–Leibler, Itakura–Saito, and Hellinger) between probability densities, signals, or more general positive definite objects. Recent work extends these measures beyond nonnegative scalar data to complex vectors and positive definite operators, resulting in distortion functionals widely applicable in statistical inference, signal processing, functional data analysis, and nonparametric estimation. Optimization of alpha-beta mean distortion yields "centroids" or means exhibiting invariance properties determined by the underlying divergence parameters, allowing adaptation to domain-specific data characteristics and noise models.

1. Mathematical Foundations: Alpha-Beta Divergences and Mean Distortion

Alpha-beta divergences are two-parameter families that parameterize many classical separable divergences. For nonnegative real arguments $p, q \geq 0$ , the divergence is expressed as:

$D^{\alpha, \beta}(p, q) = \frac{- (\alpha+\beta) p^\alpha q^\beta + \alpha p^{\alpha+\beta} + \beta q^{\alpha+\beta}}{\alpha \beta (\alpha+\beta)}$

with appropriate extension by continuity for limiting values of $\alpha$ or $\beta$ . By varying hyperparameters, one can recover widely used divergences:

$\alpha=1$ , $\beta=0$ : Kullback–Leibler divergence
$\alpha=1$ , $\beta=1$ : Euclidean distance
Other specific settings yield Itakura–Saito or Hellinger distances.

The "mean distortion" interprets the divergence as the average discrepancy between observed and modeled data, i.e.

$D_{\text{mean}} = \mathbb{E}[ D^{\alpha, \beta}(X, M) ]$

where $M$ is the minimizer considered as the centroid. This expected distortion is directly tied to statistical modeling assumptions about the data or noise generating process (e.g. Tweedie models, exponential families (Yilmaz et al., 2012)).

2. Extension to Complex Data and Signal Processing Applications

The alpha-beta mean distortion for complex vectors $p, q \in \mathbb{C}^m$ is constructed by decomposing the divergence into magnitude and angle components. If $p = \|p\| \hat{p}$ and $q = \|q\| \hat{q}$ are written in polar form, the divergence is:

$D^{\alpha, \beta}(p, q) = D^{\alpha, \beta}_{\text{AB}}(\|p\|, \|q\|) + \|p\|^\alpha \|q\|^\beta (1 - \cos \angle(p, q))$

where the angular part quantifies misalignment and is weighted by the norms and hyperparameters. This generalization recovers Euclidean or Mahalanobis squared distances for $\alpha = \beta = 1$ and adapts to applications requiring phase-sensitive measures, such as spectral analysis or direction-of-arrival estimation (Cruces, 5 Aug 2025).

3. Centroid and Minimization of Mean Distortion

The optimal "centroid" $c_s^{(\alpha, \beta)}$ minimizing mean alpha-beta distortion across a dataset $\{x_n\}$ is given by a closed-form expression involving a generalized mean and a phase-corrected, norm-adjusted term:

Define the $\alpha$ -transformation: $x^{(\alpha)} = \|x\|^{\alpha-1} x$
Compute the optimal direction: $\hat{c}_s = \frac{\mathbb{E}[x^{(\alpha)}]}{\|\mathbb{E}[x^{(\alpha)}]\|}$
Compute the generalized mean: $M_\alpha(\{\|x_n\|\}) = (\sum_n \nu_n \|x_n\|^\alpha )^{1/\alpha}$ (or geometric mean if $\alpha=0$ )
Define misalignment index: $\xi_\alpha = \frac{\mathbb{E}\left[\|x^{(\alpha)}\|\right] - \left\|\mathbb{E}[x^{(\alpha)}]\right\|}{\mathbb{E}\left[\|x^{(\alpha)}\|\right]}$
The centroid is

$c_s^{(\alpha, \beta)} = M_\alpha(\{\|x_n\|\}) \cdot \exp_{1-\alpha}(-\beta \xi_\alpha) \cdot \hat{c}_s$

where $\exp_{1-\alpha}(y) = (1 + \alpha y)_+^{1/\alpha}$ for $\alpha \neq 0$ ; $\nu_n$ are the sample weights.

This centroid construction offers direct control over sensitivity to amplitude (by $\alpha$ ) and phase misalignment (by $\beta$ ), essential for robust estimation in array processing, communications, and other domains handling complex signals (Cruces, 5 Aug 2025).

4. Connection to Tweedie Models and Statistical Noise Assumptions

Alpha-beta mean distortion extends concepts from exponential family statistics. Tweedie models, characterized by variance functions $v(\mu) = \mu^p$ , generate alpha-beta divergences as Bregman or Csiszár $f$ -divergences, where minimization is equivalent to maximum likelihood estimation under the corresponding noise model:

Gaussian ( $p=0$ ): quadratic loss, least squares
Poisson ( $p=1$ ): KL divergence
Gamma ( $p=2$ ): Itakura–Saito divergence

Thus, mean distortion reflects not only empirical discrepancy but encodes the underlying probabilistic structure and variance scaling of the data model (Yilmaz et al., 2012).

5. Infinite-Dimensional Operators and Geometry of Means

For positive definite operators (matrices, unitized trace class operators), the alpha-beta mean distortion leverages Log-Determinant divergences:

$D_{\alpha,\beta}(A,B) = \frac{1}{\alpha\beta} \log \det \left( \frac{\alpha (A^{-1}B)^{\beta} + \beta (A^{-1}B)^{-\alpha}}{\alpha + \beta} \right )$

where suitable extensions (Fredholm determinants) apply for infinite-dimensional spaces. Minimization of aggregate distortion across a set $\{A_i\}$ yields an operator mean possessing affine and congruence invariance and interpolates Riemannian (geometric) and Stein means as limiting cases. This operator-theoretic approach is fundamental for statistical analysis on RKHS covariance operators and for kernel-based machine learning methods (Quang, 2016).

6. Rate-Distortion, Constrained Optimization, and Model Selection

In lossy compression and autoencoder training, mean distortion serves as the constraint or targeted loss. The conventional $\beta$ -VAE approach

$L(\theta) = D(\theta) + \beta R(\theta)$

is replaced with distortion-constrained optimization:

$\min_\theta R(\theta) \quad \text{subject to} \quad D(\theta) \leq c_D$

with Lagrangian

$\mathcal{L}(\theta,\lambda^D) = R(\theta) + \lambda^D \left( \frac{D(\theta)}{c_D} - 1 \right)$

This facilitates direct control over the mean distortion level, improving robustness to hyperparameter tuning and enabling direct model comparisons at fixed distortion. Such methodologies outperform unconstrained $\beta$ -VAE training in rate-distortion targeting and model selection accuracy (Rozendaal et al., 2020).

7. Nonparametric Estimation and Bias–Mean Absolute Deviation Trade-off

In nonparametric statistics, the mean distortion as measured by mean absolute deviation (MAD) exhibits universal lower bounds in the bias-error trade-off. For pointwise estimation in the Gaussian white noise model and $\beta$ -Hölder smooth functions:

$\mathbb{E}[| \hat{f}(x_0) - \mathbb{E}[\hat{f}(x_0)] |] \gtrsim n^{-\beta/(2\beta+1)}$

when the estimator bias is at most $c n^{-\beta/(2\beta+1)}$ . This bound, an extension of the er-Rao lower bound to MAD, affirms that mean distortion cannot be arbitrarily minimized independently of systematic error, even in overparameterized settings (Derumigny et al., 2023).

Alpha-Beta Mean Distortion offers a versatile, theoretically grounded framework linking divergence-based discrepancy measures to optimal centroids, statistical modeling, signal processing, operator geometry, and fundamental trade-offs in estimation theory. Its parametrization and closed-form solutions underpin statistical robustness, geometric adaptability, and practical efficiency across a broad spectrum of application domains.