Predictive Diversity Score (PDS)

Updated 5 November 2025

Predictive Diversity Score (PDS) is a tunable metric, derived from the Jensen-Shannon divergence, that quantifies statistical dissimilarity between probability distributions.
It is constructed as a monoparametric family of metrics whose metric properties are guaranteed for exponents α in (0, 1/2], so the exponent gives fine control over sensitivity.
PDS finds practical applications in tasks such as time series segmentation and quantum state discrimination by detecting subtle changes and anomalies.

The Predictive Diversity Score (PDS) is a metric derived from the Jensen-Shannon divergence (JSD) family, designed to quantify statistical dissimilarity between probability distributions with tunable sensitivity. The PDS is constructed as a monoparametric family of metrics from the classical JSD, so a single exponent controls how strongly the score responds in discrimination, clustering, or segmentation tasks. The formalism supports both classical and quantum extensions and is particularly useful for time series segmentation, symbolic sequence analysis, and quantum state discrimination.

1. Jensen-Shannon Divergence: Foundation for Predictive Diversity

The Jensen-Shannon divergence between two probability distributions $P$ and $Q$ over a finite alphabet is defined as

$D_{JS}(P, Q) = \frac{1}{2}\left[ D_{KL}\left(P, \frac{P+Q}{2} \right) + D_{KL}\left(Q, \frac{P+Q}{2} \right) \right]$

where the Kullback-Leibler divergence is $D_{KL}(P, Q) = \sum_i p_i \log_2 \left(\frac{p_i}{q_i}\right)$ . Alternatively, in terms of the Shannon entropy $H(P) = -\sum_i p_i \log_2 p_i$ ,

$D_{JS}(P,Q) = H\left(\frac{P+Q}{2}\right) - \frac{1}{2}H(P) - \frac{1}{2}H(Q)$
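The two forms agree by a short calculation, sketched here with the shorthand $M = \frac{P+Q}{2}$ and components $m_i = \frac{p_i + q_i}{2}$ (notation introduced only for this step). Expanding each Kullback-Leibler term gives

$D_{KL}(P, M) = \sum_i p_i \log_2 p_i - \sum_i p_i \log_2 m_i = -H(P) - \sum_i p_i \log_2 m_i$

and likewise for $Q$ . Averaging the two terms,

$\frac{1}{2}\left[ D_{KL}(P, M) + D_{KL}(Q, M) \right] = -\frac{1}{2}H(P) - \frac{1}{2}H(Q) - \sum_i m_i \log_2 m_i = H(M) - \frac{1}{2}H(P) - \frac{1}{2}H(Q)$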

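JSD is symmetric, always finite, and bounded (between 0 and 1 when logarithms are taken base 2). It remains well defined when some outcomes have zero probability, using the convention $0 \log 0 = 0$ .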
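As a minimal sketch of the two definitions above, the following Python computes the JSD both from the entropy form and from the Kullback-Leibler form and checks that they agree. The function names are illustrative choices, not part of any published implementation, and distributions are assumed to be given as equal-length lists of probabilities.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits; outcomes with p_i = 0 contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def kl_divergence(p, q):
    """D_KL(P, Q) in bits; assumes q_i > 0 wherever p_i > 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)

def jsd(p, q):
    """Jensen-Shannon divergence via the entropy form."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return shannon_entropy(m) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

def jsd_via_kl(p, q):
    """Jensen-Shannon divergence via the Kullback-Leibler form."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * (kl_divergence(p, m) + kl_divergence(q, m))

# Distributions with zero-probability outcomes are handled without special cases,
# because the midpoint m_i is positive wherever p_i or q_i is positive.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(jsd(p, q), jsd_via_kl(p, q))
```

For these two distributions both forms evaluate to 0.5 bits, and two distributions with disjoint support reach the upper bound of 1 bit.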

2. Monoparametric Family of Metrics: Definition and Properties

The core result underpinning the PDS is the existence of a monoparametric family of distance metrics based on the JSD: $d_\alpha(P, Q) = [D_{JS}(P, Q)]^\alpha$ where $\alpha \in (0, 1/2]$ . The upper endpoint $\alpha=1/2$ recovers the square root of the JSD, and the family extends this to all smaller positive exponents. Every member satisfies all four metric axioms (non-negativity, symmetry, identity of indiscernibles, and the triangle inequality). The JSD itself ( $\alpha=1$ ) lies outside this range and is not a metric.

Metric Validity: For any $\alpha \in (0, 1/2]$ , $d_\alpha$ is a metric.
For $\alpha \geq 1$ , $d_\alpha$ is not a metric (violates triangle inequality).
For $\alpha \in (1/2,1)$ , $d_\alpha$ is conjectured not to be a metric; constructed counterexamples support this, but a general proof remains open.

Explicitly, for distributions $P$ and $Q$ ,

$d_{\alpha}(P, Q) = \left[ H\left(\frac{P+Q}{2}\right) - \frac{1}{2}H(P) - \frac{1}{2} H(Q) \right]^\alpha$

with $0 < \alpha \leq 1/2$ .

Summary of Metric Exponents

Exponent $\alpha$	Metric Property	Notes
$0 < \alpha \leq 1/2$	True metric	Main proven result
$1/2 < \alpha < 1$	Conjectured not a metric	Supported by counterexamples
$\alpha \geq 1$	Not a metric	Analytically proven
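The role of the exponent can be illustrated numerically. The sketch below, which assumes base-2 logarithms and uses illustrative function names, evaluates $d_\alpha$ on one hand-picked triple of distributions and tests the triangle inequality for a few exponents. A single triple can only exhibit a violation; it does not prove the inequality for the exponents where it holds.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits; outcomes with p_i = 0 contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def jsd(p, q):
    """Jensen-Shannon divergence in bits (entropy form)."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return shannon_entropy(m) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

def d_alpha(p, q, alpha):
    """d_alpha(P, Q) = D_JS(P, Q) ** alpha; clamps tiny negative rounding errors."""
    return max(jsd(p, q), 0.0) ** alpha

def triangle_holds(p, q, r, alpha, tol=1e-12):
    """Check d(P, Q) <= d(P, R) + d(R, Q) for this one triple."""
    return d_alpha(p, q, alpha) <= d_alpha(p, r, alpha) + d_alpha(r, q, alpha) + tol

# Two distributions with disjoint support and their midpoint.
P, Q, R = [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]
for alpha in (0.25, 0.5, 1.0):
    print(alpha, triangle_holds(P, Q, R, alpha))
```

On this triple the inequality holds for $\alpha = 0.25$ and $\alpha = 0.5$ and fails for $\alpha = 1$ , where $D_{JS}(P,Q) = 1$ exceeds $D_{JS}(P,R) + D_{JS}(R,Q) \approx 0.62$ .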

3. Sensitivity and Tunability of the Predictive Diversity Score

The parameter $\alpha$ in the PDS allows fine control over the sensitivity of the score:

Lowering $\alpha$ : Because $D_{JS} \leq 1$ with base-2 logarithms, a smaller exponent raises the value of $d_\alpha$ , most strongly for small divergences, which makes the PDS more responsive to small or subtle differences between distributions.
Choosing $\alpha$ : Should be based on task-specific requirements; for example, for sequence segmentation, lower values within the valid range can enhance detection performance.

This tunability is practically significant for tasks where the detection of distributional changes or anomalies is crucial and where simple pointwise distances might lack sufficient discriminative power.
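The effect of the exponent on small divergences can be seen with a few numbers. This is a sketch only: the divergence values are arbitrary inputs chosen for illustration, not results from any dataset, and `pds_from_jsd` is a hypothetical helper name.

```python
def pds_from_jsd(d_js, alpha):
    """Raise a JSD value (in bits, so 0 <= d_js <= 1) to the exponent alpha."""
    return d_js ** alpha

# Rows: arbitrary JSD values. Columns: exponents inside the metric range (0, 1/2].
alphas = (0.5, 0.25, 0.1)
for d_js in (0.0001, 0.01, 0.25):
    scores = [round(pds_from_jsd(d_js, a), 4) for a in alphas]
    print(d_js, scores)
```

For a fixed divergence below 1, the score grows as the exponent shrinks, so a small divergence such as 0.01 maps to 0.1 at $\alpha = 1/2$ and to roughly 0.32 at $\alpha = 1/4$ .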

4. Application to Symbolic Sequence Segmentation

In symbolic sequence segmentation, the PDS serves as a flexible statistic for detecting change-points or nonstationary behavior. The main computation involves a moving-window version of the metric:

$d'_\alpha(\ell) = \left[ -\sum_i (\pi_1 f_i + \pi_2 g_i) \log (\pi_1 f_i + \pi_2 g_i) + \pi_1 \sum_i f_i \log f_i + \pi_2 \sum_i g_i \log g_i \right]^{\alpha}$

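where $f_i$ , $g_i$ are symbol frequencies on either side of the segmentation point $\ell$ , and $\pi_1, \pi_2$ are relative window sizes with $\pi_1 + \pi_2 = 1$ . The bracketed quantity is the weighted JSD $H(\pi_1 f + \pi_2 g) - \pi_1 H(f) - \pi_2 H(g)$ . The position maximizing $d'_\alpha(\ell)$ estimates the change-point.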
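A minimal sketch of this scan follows. It assumes a single window covering the whole sequence, weights $\pi_1 = \ell/N$ and $\pi_2 = (N-\ell)/N$ , and base-2 logarithms; these choices and the function names are illustrative and are not taken from the source. With these weights the mixture $\pi_1 f + \pi_2 g$ equals the pooled symbol frequencies, so its entropy is computed once, and running counts keep each split cheap.

```python
import math
from collections import Counter

def entropy_from_counts(counts, total):
    """Shannon entropy (bits) of the empirical distribution given by symbol counts."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)

def segmentation_profile(seq, alpha=0.5):
    """Return {l: d'_alpha(l)} for every split of seq into seq[:l] and seq[l:]."""
    n = len(seq)
    left, right = Counter(), Counter(seq)
    h_pooled = entropy_from_counts(right, n)  # entropy of pi1*f + pi2*g
    profile = {}
    for l in range(1, n):
        symbol = seq[l - 1]
        left[symbol] += 1    # move one symbol from the right window
        right[symbol] -= 1   # to the left window
        pi1, pi2 = l / n, (n - l) / n
        d = (h_pooled
             - pi1 * entropy_from_counts(left, l)
             - pi2 * entropy_from_counts(right, n - l))
        profile[l] = max(d, 0.0) ** alpha  # clamp tiny negative rounding errors
    return profile

def estimate_change_point(seq, alpha=0.5):
    """Split position with the largest statistic."""
    profile = segmentation_profile(seq, alpha)
    return max(profile, key=profile.get)

seq = "A" * 50 + "B" * 50
print(estimate_change_point(seq, alpha=0.25))
```

Because $x \mapsto x^\alpha$ is increasing, the exponent rescales the profile without moving its maximum in this sketch; what changes with $\alpha$ is how pronounced the peak is relative to the rest of the profile.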

Empirical results show that using lower $\alpha$ yields more pronounced segmentation statistics, providing practical guidance for parameter selection in diverse segmentation scenarios.

5. Quantum Generalization of the Predictive Diversity Score

The PDS framework extends naturally to quantum information:

Quantum JSD metric:

$D_{JS1}(\rho,\sigma) = \max_{\{ E_i \}} D_{JS}(p_i, q_i)$

where $p_i = \mathrm{Tr}(E_i \rho)$ , $q_i = \mathrm{Tr}(E_i \sigma)$ , and the maximization is over all POVMs.

The family $[D_{JS1}(\rho,\sigma)]^\alpha$ defines a quantum metric for $\alpha \in (0,1/2]$ .
The quantum PDS thus serves as a bona fide metric for distinguishing density operators, with direct applicability to convergence testing, algorithm performance boundaries, and sensitivity analysis in quantum state space.
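The maximization over all POVMs is the expensive part. As a hedged illustration, the sketch below evaluates the classical quantity for one fixed measurement, which by definition of the maximum gives a lower bound on $D_{JS1}$ and therefore on the quantum score. It assumes real symmetric density matrices stored as nested lists, base-2 logarithms, and illustrative function names; it does not attempt the optimization itself.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits; outcomes with p_i = 0 contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def jsd(p, q):
    """Jensen-Shannon divergence in bits (entropy form)."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return shannon_entropy(m) - 0.5 * shannon_entropy(p) - 0.5 * shannon_entropy(q)

def trace_of_product(a, b):
    """Tr(A B) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

def measurement_probs(povm, rho):
    """Outcome probabilities p_i = Tr(E_i rho) for each POVM element E_i."""
    return [trace_of_product(e, rho) for e in povm]

def pds_lower_bound(rho, sigma, povm, alpha=0.5):
    """[D_JS(p, q)]^alpha for one fixed POVM: a lower bound on the quantum score."""
    p = measurement_probs(povm, rho)
    q = measurement_probs(povm, sigma)
    return max(jsd(p, q), 0.0) ** alpha

rho = [[1.0, 0.0], [0.0, 0.0]]      # |0><0|
sigma = [[0.5, 0.5], [0.5, 0.5]]    # |+><+|
z_basis = [[[1.0, 0.0], [0.0, 0.0]],   # projector onto |0>
           [[0.0, 0.0], [0.0, 1.0]]]   # projector onto |1>
print(pds_lower_bound(rho, sigma, z_basis, alpha=0.5))
```

Other measurements can give larger values for the same pair of states, which is why the definition takes the maximum.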

6. Implementation Considerations and Limitations

Computational Aspects

Cost: The computation involves histogram-based probability estimation and Shannon entropy evaluations; for moderate alphabet sizes or sequence segmentation, complexity scales linearly in sequence length and polynomially (linearly in most practical regimes) in alphabet size.
Parameter Selection: Explicit choice of $\alpha$ is critical, requiring validation or domain knowledge.
Domain of Applicability: While the PDS is well-defined for any pair of probability distributions on a finite alphabet, its quantum extension requires evaluation over all POVMs, which can be computationally demanding in high-dimensional Hilbert spaces.

Limitations

For $\alpha \notin (0,1/2]$ : The triangle inequality is proven to fail for $\alpha \geq 1$ and is conjectured to fail for $\alpha \in (1/2,1)$ , so metric-space guarantees are not available outside the valid range.
Applications with severely limited data: Reliability of empirical probability estimates may be compromised, potentially affecting the sensitivity and stability of the PDS.

7. Significance and Impact in Statistical Data Analysis

The introduction of the Predictive Diversity Score as a monoparametric JSD-based metric offers a modular, interpretable, and tunable measure of distributional dissimilarity. It grounds applications such as symbolic sequence segmentation, detection of nonstationarity, and quantum state discrimination in a rigorous metric framework, with the advantage of tunable sensitivity. The quantum extension corroborates the general applicability of the construction to broader informational and physical contexts, including quantum information processing.

Overall, the PDS (i.e., the family $d_\alpha(P,Q) = [D_{JS}(P,Q)]^\alpha$ ) provides a theoretically sound alternative to fixed, parameter-free divergences in scenarios that call for a tunable, metric quantification of statistical diversity (Osán et al., 2017).

References

Osán et al. (2017). Monoparametric family of metrics derived from classical Jensen-Shannon divergence.
