Generalized Mean (α–β) Fusion

Updated 11 April 2026

Generalized Mean (α–β) Fusion is a framework that combines Tsallis coupled-surprisal with power means for robust, adaptive scoring in statistical risk assessment.
It employs coupled logarithms and escort distributions to balance decisiveness and robustness, serving applications in complex systems and machine learning.
The method links parameterized deformations to effective probability metrics, enabling improved handling of heavy-tailed distributions and operational risk control.

The Tsallis coupled-surprisal is a generalized information metric that extends the classical notion of surprisal, emerging from nonextensive statistical mechanics and the theory of coupled entropy. It addresses key robustness and stability challenges present in both the original Tsallis entropy and its normalized variant, providing a parameterized framework for quantifying statistical risk, decisiveness, and robustness, while also connecting closely to the behavior of complex systems and heavy-tailed distributions.

1. Foundations: Tsallis Entropy, Surprisal, and Generalized Logarithms

The classical Tsallis entropy framework is predicated on deformations of the Shannon information measures. For a discrete distribution $\{p_i\}_{i=1}^W$, the $q$-logarithm and its inverse, the $q$-exponential, are

$\ln_q(x) = \frac{x^{1-q} - 1}{1-q}, \quad \exp_q(x) = [1 + (1-q)x]_+^{1/(1-q)},$

where $[\cdot]_+ = \max\{0, \cdot\}$. The Tsallis entropy itself is defined as

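$S_q(\mathbf p) = -\sum_{i=1}^W p_i^q \ln_q(p_i) = \frac{1 - \sum_{i=1}^W p_i^q}{q-1},$

and the associated Tsallis surprisal (or $q$-surprisal) for outcome $i$ is $s_q(p_i) = -\ln_q(p_i)$, so that $S_q(\mathbf p) = \sum_i p_i^q s_q(p_i)$. The weights $p_i^q$ are required for the closed form to hold: with plain weights $p_i$ the sum would instead involve $\sum_i p_i^{2-q}$.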
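As a minimal sketch, the definitions above can be checked numerically in Python. The function names `ln_q`, `exp_q` and `tsallis_entropy` are illustrative rather than taken from the cited papers, and the $q = 1$ branches are an added convenience so that the Shannon case avoids division by zero.

```python
import math
from typing import Sequence


def ln_q(x: float, q: float) -> float:
    """q-logarithm for x > 0; falls back to the natural log at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def exp_q(x: float, q: float) -> float:
    """q-exponential with the [.]_+ cutoff; inverse of ln_q on its range."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = max(0.0, 1.0 + (1.0 - q) * x)
    if base == 0.0:
        # At the cutoff the value is 0 for q < 1 and diverges for q > 1.
        return 0.0 if q < 1.0 else math.inf
    return base ** (1.0 / (1.0 - q))


def tsallis_entropy(p: Sequence[float], q: float) -> float:
    """Closed form (1 - sum_i p_i^q) / (q - 1); Shannon entropy at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(pi * math.log(pi) for pi in p if pi > 0.0)
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)


p = [0.5, 0.3, 0.2]
roundtrip = exp_q(ln_q(0.3, 1.5), 1.5)          # should reproduce the input 0.3
s2 = tsallis_entropy(p, 2.0)                    # 1 - (0.25 + 0.09 + 0.04)
near_shannon = tsallis_entropy(p, 1.0 + 1e-6)   # close to the Shannon value
shannon = tsallis_entropy(p, 1.0)
print(roundtrip, s2, near_shannon, shannon)
```

The last two values agree to several decimal places, which illustrates that the deformation reduces to the Shannon measure as $q \to 1$.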

The principle of nonlinear statistical coupling in this context introduces a deformed logarithm, the coupled logarithm, parametrized by $\kappa$ (the coupling parameter) and given as

$\ln_\kappa(x) = \frac{x^{\kappa} - 1}{\kappa},$

with inverse coupled exponential $\exp_\kappa(x) = [1 + \kappa x]_+^{1/\kappa}$. This deformation directly generalizes the Shannon case ($\kappa \to 0$) and underlies the Tsallis coupled-surprisal concept (Nelson et al., 2011).
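A small Python sketch of this pair follows, assuming the coupled logarithm has the form $\ln_\kappa(x) = (x^\kappa - 1)/\kappa$ and the coupled exponential is its algebraic inverse $[1 + \kappa x]_+^{1/\kappa}$. The function names are illustrative. The sketch checks the inverse relationship for a few couplings and the convergence to the natural logarithm as $\kappa \to 0$.

```python
import math


def coupled_log(x: float, kappa: float) -> float:
    """Coupled logarithm (assumed form); natural log at kappa = 0."""
    if abs(kappa) < 1e-12:
        return math.log(x)
    return (x ** kappa - 1.0) / kappa


def coupled_exp(x: float, kappa: float) -> float:
    """Coupled exponential with the [.]_+ cutoff; inverse of coupled_log."""
    if abs(kappa) < 1e-12:
        return math.exp(x)
    base = max(0.0, 1.0 + kappa * x)
    if base == 0.0:
        # At the cutoff the value is 0 for kappa > 0 and diverges for kappa < 0.
        return 0.0 if kappa > 0.0 else math.inf
    return base ** (1.0 / kappa)


x = 0.2
roundtrips = {k: coupled_exp(coupled_log(x, k), k) for k in (-0.5, 0.0, 0.5, 1.0)}
shannon_gap = abs(coupled_log(x, 1e-6) - math.log(x))
print(roundtrips, shannon_gap)
```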

2. Instability of Normalized Tsallis Entropy and Emergence of Coupled Entropy

Nonextensive statistical mechanics often employs expectation constraints defined via the escort distribution:

$P_i^{(q)} = \frac{p_i^q}{\sum_{j=1}^W p_j^q}.$

To restore consistency within this setting, the normalized Tsallis entropy (NTE) was introduced:

$S_q^{\mathrm{NTE}}(\mathbf p) = \frac{S_q(\mathbf p)}{\sum_{i=1}^W p_i^q}.$

However, $S_q^{\mathrm{NTE}}$ is inherently unstable: in certain limiting regimes the normalization $\sum_i p_i^q$ can vary dramatically, undermining continuity and violating Lesche stability (Nelson, 17 May 2025).
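The following Python sketch illustrates the escort distribution and the NTE, assuming the standard forms $P_i^{(q)} = p_i^q / \sum_j p_j^q$ and $S_q^{\mathrm{NTE}} = S_q / \sum_i p_i^q$. All names and the example distributions are made up for illustration. It also verifies that the NTE equals the escort average of $-\ln_q(p_i)$, and shows how strongly $\sum_i p_i^q$ reacts when a small amount of probability mass is spread over many rare states.

```python
from typing import List, Sequence


def power_sum(p: Sequence[float], q: float) -> float:
    """Normalization sum_i p_i^q (zero-probability states are skipped)."""
    return sum(pi ** q for pi in p if pi > 0.0)


def escort(p: Sequence[float], q: float) -> List[float]:
    """Escort distribution p_i^q / sum_j p_j^q, for q > 0."""
    z = power_sum(p, q)
    return [pi ** q / z for pi in p]


def normalized_tsallis_entropy(p: Sequence[float], q: float) -> float:
    """S_q / sum_i p_i^q, for q != 1."""
    z = power_sum(p, q)
    return (1.0 - z) / ((q - 1.0) * z)


def ln_q(x: float, q: float) -> float:
    """q-logarithm for q != 1."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


q = 0.5
p = [0.5, 0.3, 0.2]
escort_avg = sum(P * -ln_q(pi, q) for P, pi in zip(escort(p, q), p))
nte = normalized_tsallis_entropy(p, q)

# A certain outcome versus the same outcome with 0.1% of the mass
# spread over 1000 rare states: the normalization roughly doubles.
z_sharp = power_sum([1.0], q)
z_perturbed = power_sum([0.999] + [1e-6] * 1000, q)
print(escort_avg, nte, z_sharp, z_perturbed)
```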

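To remedy this, a further normalization divides the normalized Tsallis entropy $S_q^{\mathrm{NTE}}$ by $1 + d\kappa$:

$S_\kappa(\mathbf p) = \frac{S_q^{\mathrm{NTE}}(\mathbf p)}{1 + d\kappa}.$

Here, $d$ denotes the dimension (degrees of freedom), $\kappa$ is the coupling (tail-shape) parameter, and the effective Tsallis index is

$q = 1 + \frac{\alpha\kappa}{1 + d\kappa},$

with $\alpha$ controlling the local shape near the location and $\kappa$ governing asymptotic tails.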
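A hedged numerical sketch of this step is given below. It assumes the effective index takes the form $q = 1 + \alpha\kappa/(1 + d\kappa)$ and that the coupled entropy is the NTE divided by $1 + d\kappa$; both are assumptions about the intended formulas, and the function names are hypothetical. The sketch requires $1 + d\kappa > 0$.

```python
import math
from typing import Sequence


def effective_q(alpha: float, d: float, kappa: float) -> float:
    """Assumed mapping from (alpha, d, kappa) to the Tsallis index q."""
    return 1.0 + alpha * kappa / (1.0 + d * kappa)


def coupled_entropy(p: Sequence[float], alpha: float, d: float, kappa: float) -> float:
    """Assumed form: normalized Tsallis entropy divided by (1 + d * kappa)."""
    q = effective_q(alpha, d, kappa)
    if abs(q - 1.0) < 1e-12:
        # kappa = 0: no deformation, so the Shannon entropy is returned.
        return -sum(pi * math.log(pi) for pi in p if pi > 0.0)
    z = sum(pi ** q for pi in p if pi > 0.0)
    nte = (1.0 - z) / ((q - 1.0) * z)
    return nte / (1.0 + d * kappa)


p = [0.5, 0.3, 0.2]
for kappa in (0.0, 0.5, 2.0):
    print(kappa, effective_q(2.0, 1.0, kappa), coupled_entropy(p, 2.0, 1.0, kappa))
```

Under this mapping $q$ stays bounded as the coupling grows, approaching $1 + \alpha/d$ for large $\kappa$.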

3. Definition and Properties of the Tsallis Coupled-Surprisal

The Tsallis coupled-surprisal, $s_\kappa(p_i)$, arises by expressing the basic contribution to coupled entropy using the coupled logarithm,

$\ln_\kappa(p_i) = \frac{p_i^{\kappa} - 1}{\kappa},$

with associated coupled-surprisal

$s_\kappa(p_i) = -\ln_\kappa(p_i) = \frac{1 - p_i^{\kappa}}{\kappa}.$

The full coupled entropy $S_\kappa(\mathbf p)$ is then the expectation of $s_\kappa(p_i)$ over the outcomes $i$.

Comparisons are summarized as follows:

Quantity	Tsallis $q$-formulation	Coupled $\kappa$-formulation
Surprisal	$s_q(p_i)$	$s_\kappa(p_i)$
Entropy	$S_q(\mathbf p)$	$S_\kappa(\mathbf p)$
Maximizing distribution	$q$-exponential family	Coupled exponential family

The coupling parameter $\kappa$ quantifies nonlinearity and nonadditivity, directly modulating the risk profile: $\kappa > 0$ yields heavy-tailed, decisive behaviors, while $\kappa < 0$ confers robustness by penalizing low-probability assignments more heavily (Nelson, 17 May 2025; Nelson et al., 2011).

4. Relationship to Effective Probability and Generalized Means

A fundamental operational property of the coupled-surprisal is its link to generalized (power) means. For $N$ forecast probabilities $p_1, \dots, p_N$, the average coupled-surprisal is

$\bar{s}_\kappa = -\frac{1}{N}\sum_{i=1}^N \ln_\kappa(p_i) = \frac{1}{\kappa}\left(1 - \frac{1}{N}\sum_{i=1}^N p_i^{\kappa}\right),$

and the corresponding effective probability is obtained by inverting the coupled logarithm with the coupled exponential:

$P_{\mathrm{eff}} = \exp_\kappa(-\bar{s}_\kappa) = \left(\frac{1}{N}\sum_{i=1}^N p_i^{\kappa}\right)^{1/\kappa},$

which is the power mean of order $\kappa$. Key limiting cases include:

$\kappa \to 0$: recovers geometric mean and Shannon surprisal,
$\kappa = 1$: yields arithmetic mean and linear cost,
$\kappa = -1$: harmonic mean (Nelson et al., 2011).
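The equivalence between the average coupled-surprisal and the power mean can be checked with a short Python sketch. It assumes the coupled-surprisal $-\ln_\kappa(p) = (1 - p^\kappa)/\kappa$; the function names and the three forecast probabilities are illustrative only.

```python
import math
from typing import Sequence


def coupled_surprisal(p: float, kappa: float) -> float:
    """-ln_kappa(p); Shannon surprisal -ln(p) at kappa = 0."""
    if abs(kappa) < 1e-12:
        return -math.log(p)
    return (1.0 - p ** kappa) / kappa


def effective_probability(probs: Sequence[float], kappa: float) -> float:
    """Map the average coupled-surprisal back to a probability via exp_kappa."""
    mean_s = sum(coupled_surprisal(p, kappa) for p in probs) / len(probs)
    if abs(kappa) < 1e-12:
        return math.exp(-mean_s)
    return (1.0 - kappa * mean_s) ** (1.0 / kappa)


def power_mean(probs: Sequence[float], r: float) -> float:
    """Generalized mean of order r (geometric mean at r = 0)."""
    if abs(r) < 1e-12:
        return math.exp(sum(math.log(p) for p in probs) / len(probs))
    return (sum(p ** r for p in probs) / len(probs)) ** (1.0 / r)


probs = [0.9, 0.6, 0.05]   # forecast probabilities assigned to the true events
for kappa in (-1.0, 0.0, 0.5, 1.0):
    print(kappa, effective_probability(probs, kappa), power_mean(probs, kappa))
```

For this example the single poor forecast (0.05) pulls the harmonic mean ($\kappa = -1$) far below the arithmetic mean ($\kappa = 1$), which is the sense in which negative coupling scores more conservatively.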

This mapping allows one to operationalize the coupled-surprisal as a scoring rule directly connected to risk sensitivity: sliding $\kappa$ (or $q$) tunes the severity of penalties for low-probability assignments.

5. Maximizing Distributions: The Coupled Exponential Family

Under constraints defined via the coupled expectation, maximization of the coupled entropy $S_\kappa$ yields distributions in the coupled exponential family, which for location $\mu$, scale $\sigma$, dimension $d$ and power $\alpha$ take the form

$f(x) \propto \left[1 + \kappa \left|\frac{x - \mu}{\sigma}\right|^{\alpha}\right]_+^{-\frac{1 + d\kappa}{\alpha\kappa}}.$

Special cases include:

Linear ($\alpha = 1$): generalized Pareto distributions,
Quadratic ($\alpha = 2$): Student-$t$ (coupled Gaussian) distributions, with tail index $\nu = 1/\kappa$,
General $\alpha$-power: coupled Weibull (stretched-exponential) families.

The family thus interpolates between exponential/Gaussian laws ($\kappa \to 0$) and heavy-tailed forms ($\kappa > 0$), supplying a principled basis for modeling complex system phenomenology (Nelson, 17 May 2025).
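The special cases can be checked numerically under an assumed unnormalized density $[1 + \kappa |(x - \mu)/\sigma|^\alpha]_+^{-(1 + d\kappa)/(\alpha\kappa)}$. This form is an assumption chosen to be consistent with the special cases listed above, not a formula quoted from the source, and the function names are hypothetical. The comparison kernels are the standard unnormalized Student-$t$ and generalized Pareto densities.

```python
import math


def coupled_kernel(x: float, kappa: float, alpha: float, d: float = 1.0,
                   mu: float = 0.0, sigma: float = 1.0) -> float:
    """Unnormalized coupled exponential density (assumed form, 1 + d*kappa > 0)."""
    z = abs((x - mu) / sigma) ** alpha
    if abs(kappa) < 1e-12:
        return math.exp(-z / alpha)       # exponential / Gaussian limit
    base = max(0.0, 1.0 + kappa * z)
    if base == 0.0:
        return 0.0                        # outside the compact support (kappa < 0)
    return base ** (-(1.0 + d * kappa) / (alpha * kappa))


def student_t_kernel(x: float, nu: float) -> float:
    """Unnormalized Student-t density with nu degrees of freedom."""
    return (1.0 + x * x / nu) ** (-(nu + 1.0) / 2.0)


def gpd_kernel(x: float, xi: float) -> float:
    """Unnormalized generalized Pareto density with shape xi and unit scale."""
    return (1.0 + xi * x) ** (-(1.0 / xi + 1.0))


x, kappa = 1.7, 0.25
print(coupled_kernel(x, kappa, alpha=2.0), student_t_kernel(x, nu=1.0 / kappa))
print(coupled_kernel(x, kappa, alpha=1.0), gpd_kernel(x, xi=kappa))
print(coupled_kernel(x, 1e-6, alpha=2.0), math.exp(-x * x / 2.0))
```

Each printed pair should agree, the last one only approximately since it approaches the Gaussian kernel as $\kappa \to 0$.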

6. Practical Applications: Information Fusion, Statistical Complexity, and Machine Learning

The coupled-surprisal functions as a tunable scoring rule in decision-theoretic and machine learning contexts. In information fusion, as discussed in Nelson et al. (2011), the $\alpha$–$\beta$ fusion algorithm combines input likelihoods using generalized means, with $\alpha$ (the generalized-mean parameter) governing the smoothing/aggregation and $\beta$ modulating effective independence. Adjusting the coupled-surprisal parameter $\kappa$ allows practitioners to control the trade-off between decisiveness ($\kappa > 0$, optimistic, low cost for $p \to 0$) and robustness ($\kappa < 0$, conservative, harsh penalty for $p \to 0$), aligning the metric with risk preferences and application objectives.
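One possible reading of this fusion step is sketched below in Python. It is a hypothetical illustration rather than the authors' implementation: for each class it takes the power mean of order $\alpha$ across the $N$ sources, raises the result to the power $N^\beta$, and renormalizes across classes. The function names, the per-class renormalization and the example inputs are all assumptions.

```python
import math
from typing import List, Sequence


def power_mean(values: Sequence[float], r: float) -> float:
    """Generalized mean of order r (geometric mean at r = 0)."""
    if abs(r) < 1e-12:
        return math.exp(sum(math.log(v) for v in values) / len(values))
    return (sum(v ** r for v in values) / len(values)) ** (1.0 / r)


def alpha_beta_fusion(source_probs: Sequence[Sequence[float]],
                      alpha: float, beta: float) -> List[float]:
    """Fuse N per-source class probabilities (hypothetical sketch).

    alpha sets the order of the generalized mean across sources;
    beta in [0, 1] sets the exponent N**beta applied to that mean.
    """
    n_sources = len(source_probs)
    n_classes = len(source_probs[0])
    exponent = n_sources ** beta
    scores = []
    for j in range(n_classes):
        column = [src[j] for src in source_probs]
        scores.append(power_mean(column, alpha) ** exponent)
    total = sum(scores)
    return [s / total for s in scores]


sources = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]
product_rule = alpha_beta_fusion(sources, alpha=0.0, beta=1.0)  # independent sources
averaging = alpha_beta_fusion(sources, alpha=1.0, beta=0.0)     # fully shared evidence
print(product_rule, averaging)
```

In this sketch $\alpha = 0, \beta = 1$ reduces to the normalized product of the inputs (treating the sources as independent), while $\alpha = 1, \beta = 0$ reduces to simple averaging.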

In machine learning, especially in robust variational inference and coupled variational autoencoder models, coupled entropy provides an extra stabilizing factor. Sampling from the appropriate escort distributions with the additional $1 + d\kappa$ normalization dampens instabilities during training, making the method suitable for handling heavy-tailed data and model calibration (Nelson, 17 May 2025).

Physically, the coupling parameter $\kappa$ corresponds to the strength of statistical nonlinearity and interaction intensity, serving as a candidate measure for statistical complexity in heterogeneous and correlated environments.

7. Interpretations and Operational Significance

Accuracy: risk-neutral, $\kappa = 0$.
Decisiveness: $\kappa > 0$, lower penalty for low probabilities, more sensitive to sharp forecasts.
Robustness: $\kappa < 0$, higher penalty for low probabilities, less sensitive to outlier forecasts (Nelson et al., 2011).
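The three regimes can be made concrete with a brief Python sketch that scores the same forecasts under a negative, zero and positive coupling, assuming the coupled-surprisal $-\ln_\kappa(p) = (1 - p^\kappa)/\kappa$. The chosen couplings and probabilities are illustrative.

```python
import math


def coupled_surprisal(p: float, kappa: float) -> float:
    """-ln_kappa(p); Shannon surprisal -ln(p) at kappa = 0."""
    if abs(kappa) < 1e-12:
        return -math.log(p)
    return (1.0 - p ** kappa) / kappa


couplings = (-0.5, 0.0, 0.5)   # robust, accurate (risk-neutral), decisive
for p in (0.9, 0.5, 0.01):
    costs = [round(coupled_surprisal(p, k), 3) for k in couplings]
    print(p, costs)
```

For a confident forecast such as $p = 0.9$ the three scores are nearly identical, whereas for $p = 0.01$ they differ by an order of magnitude (roughly 18, 4.6 and 1.8). With $\kappa > 0$ the cost is also bounded above by $1/\kappa$, so even a forecast of nearly zero probability incurs a finite penalty.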

This one-parameter deformation facilitates direct control over the information score’s sensitivity, providing an adjustable lens for evaluating model outputs and forecast probabilities according to domain-specific risk tolerances and desired operational characteristics.

The Tsallis coupled-surprisal thus constitutes a natural and robust extension of classical information measures, endowed with clear connections to both nonextensive statistical mechanics and practical statistical learning (Nelson, 17 May 2025, Nelson et al., 2011).

References

Nelson et al. (2011). A risk profile for information fusion algorithms.

Nelson (17 May 2025). Coupled Entropy: A Goldilocks Generalization?
