Dynamic Mean Pairwise Distance (DMPD) Metrics

Updated 10 January 2026

Dynamic Mean Pairwise Distance (DMPD) is an umbrella term for metrics that quantify pairwise dissimilarity in dynamic or high-dimensional systems, covering both temporal and spatial divergence.
It is applied in LLM-generated code profiling, polymer chain analysis, and high-dimensional change-point detection to reveal operational instability, conformational scaling, and abrupt transitions.
Across these uses, averaging pairwise distances yields measures of system stability, structure, and statistical shifts; in code profiling the metric is additionally shape-based and scale-invariant.

Dynamic Mean Pairwise Distance (DMPD) is a unifying term for a class of metrics quantifying pairwise dissimilarity in dynamic or high-dimensional systems, with distinct formalizations in computational program profiling, polymer physics, and statistical change-point detection. Although implementations differ by field, DMPD always encodes an averaged or collective notion of distance (measured over time, along chains, or across generations) that provides insight into stability, conformity, or shifts in underlying processes. This entry surveys the three principal formal definitions, their quantitative properties, and their applied methodologies, as presented in the cited arXiv contributions.

1. Solution-Level Runtime Divergence: DMPD in LLM-Generated Code

DMPD was introduced as a solution-level instability metric for evaluating runtime memory divergence among multiple correct program generations by LLMs (Rajput et al., 3 Jan 2026). When several code completions pass all functional tests, their operational behaviors—such as memory traces under real execution—can differ substantially, potentially introducing hidden risks. DMPD measures the mean temporal dissimilarity between correct solutions’ memory-usage profiles under test execution.

Formally, for each execution trace $T = [t_1, ..., t_n]$, transient memory fluctuations are suppressed by constructing a Monotonic Peak Profile (MPP) $P = [p_1, ..., p_n]$, where:

Baseline correction: $s'_i = \max(0, t_i - t_1)$ for $i = 1, \dots, n$
Cumulative maximum (monotonic envelope): $p_1 = 0$, $p_i = \max(p_{i-1}, s'_i)$ for $i = 2, \dots, n$
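The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; the function name `monotonic_peak_profile` and the use of plain lists of memory samples are assumptions made for the example.

```python
from typing import List, Sequence


def monotonic_peak_profile(trace: Sequence[float]) -> List[float]:
    """Return the Monotonic Peak Profile (MPP) of a memory trace.

    Hypothetical helper: subtract the first sample as a baseline, clip
    negative values to zero, then take the running maximum.
    """
    if not trace:
        return []
    baseline = trace[0]
    profile: List[float] = []
    running_max = 0.0
    for sample in trace:
        corrected = max(0.0, sample - baseline)    # baseline correction
        running_max = max(running_max, corrected)  # cumulative maximum
        profile.append(running_max)
    return profile


# A trace that dips after each peak yields a non-decreasing envelope.
example = monotonic_peak_profile([100, 120, 110, 150, 90])
# example is [0.0, 20, 20, 50, 50]
```

Because the envelope never decreases, short-lived drops in memory (for instance after garbage collection) do not affect the profile.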

Pairs of MPPs $P^a$ and $P^b$ from two solutions are normalized to unit peak before comparison. $\text{DMPD}(P^a, P^b)$ is then defined as the mean per-step Dynamic Time Warping (DTW) path cost with local $L^1$ distance, which enforces a shape-based, scale-free comparison:

$\text{DMPD}(P^a, P^b) = \frac{D_\text{cum}(n,m)}{|\pi^*|} \;\; \in [0,1]$

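where $D_\text{cum}(n,m)$ is the cumulative cost of the optimal alignment between profiles of lengths $n$ and $m$, and $|\pi^*|$ is the number of steps in the optimal warping path $\pi^*$. Because both profiles are normalized to unit peak, every local $L^1$ cost lies in $[0,1]$, and so does their mean.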
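A minimal sketch of this pairwise computation follows, using a textbook DTW recurrence with steps (match, insertion, deletion). The names `normalize_to_unit_peak` and `dmpd_pair` are hypothetical. Two details are assumptions not specified above: an all-zero profile is left as zeros, and ties between equal-cost paths are broken in favor of the shorter path.

```python
from typing import List, Sequence, Tuple


def normalize_to_unit_peak(profile: Sequence[float]) -> List[float]:
    """Scale a profile so its maximum is 1 (all-zero profiles stay zero)."""
    peak = max(profile, default=0.0)
    if peak == 0:
        return [0.0 for _ in profile]
    return [value / peak for value in profile]


def dmpd_pair(profile_a: Sequence[float], profile_b: Sequence[float]) -> float:
    """Mean per-step DTW cost between two unit-peak-normalized profiles."""
    a = normalize_to_unit_peak(profile_a)
    b = normalize_to_unit_peak(profile_b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("profiles must be non-empty")
    inf = float("inf")
    # Each cell holds (cumulative cost, path length) of the best path so far.
    table: List[List[Tuple[float, int]]] = [
        [(inf, 0)] * (m + 1) for _ in range(n + 1)
    ]
    table[0][0] = (0.0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # local L1 distance
            # Tuple comparison: lowest cost first, then shortest path.
            prev_cost, prev_len = min(
                table[i - 1][j - 1], table[i - 1][j], table[i][j - 1]
            )
            table[i][j] = (prev_cost + local, prev_len + 1)
    total_cost, path_length = table[n][m]
    return total_cost / path_length


# Profiles with the same shape but different amplitude are at distance 0.
same_shape = dmpd_pair([0, 1, 2], [0, 10, 20])  # 0.0
```

The quadratic table is adequate for short traces; long traces would normally call for a windowed or otherwise constrained DTW variant.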

Aggregating across $k$ correct traces yields the averaged solution-level instability:

$\text{DMPD}_k = \frac{2}{k(k-1)} \sum_{1 \leq p < q \leq k} \text{DMPD}(P^p, P^q)$

This aggregation is performed separately for each private test. Problem- and model-level instability scores are then obtained by averaging DMPD across tests and problems, yielding the Model Instability Score (MIS) in macro and micro variants, which quantifies aggregate runtime divergence.
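The pairwise average $\text{DMPD}_k$ can be sketched generically. The helper below, `mean_pairwise`, is a hypothetical name; it accepts any symmetric distance function, so the absolute difference used in the example is only a stand-in for the DTW-based pairwise DMPD.

```python
from itertools import combinations
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def mean_pairwise(items: Sequence[T], distance: Callable[[T, T], float]) -> float:
    """Average a symmetric distance over all unordered pairs of items."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    total = sum(distance(p, q) for p, q in combinations(items, 2))
    return 2.0 * total / (k * (k - 1))  # same as total / C(k, 2)


# Stand-in distance on scalars; the pairs give 0.25, 0.75 and 0.5.
score = mean_pairwise([0.0, 0.25, 0.75], lambda x, y: abs(x - y))
# score is 0.5
```

Passing the distance as a callable keeps the aggregation independent of how individual profiles are compared.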

Key findings include:

DMPD reveals substantial operational divergence even among functionally correct LLM solutions.
Instability (MIS) increases with higher sampling temperature, showing a tradeoff: higher diversity improves pass@1 but often at the cost of more runtime divergence.
DMPD shows orthogonality with static/textual code similarity metrics, highlighting its role as a novel execution-centric measure.
The shape-based definition of DMPD makes it robust to input-size scaling, and it outperforms amplitude-centric baselines in this respect.
Empirical correlations are observed between high DMPD (and related metrics like normalized maximum velocity, NMV) and software maintainability properties such as Cognitive/Cyclomatic complexity, suggesting operational behavior is linked to code maintainability.

These features support stability-aware candidate selection in CI/CD workflows and advocate for operational profiling as a complement to functional correctness (Rajput et al., 3 Jan 2026).

2. DMPD in Polymer Chain Physics: Loop-Extruded Rouse Models

In the statistical physics of polymers, DMPD quantifies spatial compactness and scaling properties in active loop-extruded Rouse chains (Nikitin et al., 8 May 2025). Here, $\text{DMPD}(s)$ is defined as the root-mean-square spatial separation between monomers/beads at contour separation $s$:

$\text{DMPD}(s) \equiv \left\langle |\mathbf{r}_i - \mathbf{r}_j|^2 \right\rangle^{1/2}, \quad s = j - i$
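As an illustration of how this quantity could be estimated from simulated conformations, the sketch below averages the squared separation over an ensemble of chains. The name `rms_separation` and the tuple-based data layout are hypothetical. Averaging over all bead pairs $(i, i+s)$ along each chain, in addition to the ensemble average, is an assumption that only makes sense when the chain is statistically homogeneous along its contour.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rms_separation(conformations: Sequence[Sequence[Vec3]], s: int) -> float:
    """Root-mean-square distance between beads i and i + s.

    The average runs over every conformation and every valid i.
    """
    if s < 1:
        raise ValueError("contour separation s must be at least 1")
    total = 0.0
    count = 0
    for chain in conformations:
        for i in range(len(chain) - s):
            total += sum((a - b) ** 2 for a, b in zip(chain[i], chain[i + s]))
            count += 1
    if count == 0:
        raise ValueError("no bead pairs at this separation")
    return math.sqrt(total / count)


# A straight rod with unit bond length has separation exactly s.
rod = [(float(i), 0.0, 0.0) for i in range(5)]
value = rms_separation([rod], 2)  # 2.0
```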

The model incorporates:

Thermal Rouse dynamics (diffusion + harmonic connectivity)
Stochastic binding/unbinding of loop-extruding motors at rates $k_{in}$, $k_{off}$
Rare-loops (non-overlapping), symmetric rapid extrusion

The stationary mean-square separation $\langle R^2(s) \rangle$ is computed by solving a linear system (from a diagrammatic, one-loop approximation to the Fokker-Planck equation) involving bead-bond correlations $\mu(m)$, determined by kinetic coefficients and dimensionless control parameters. $\text{DMPD}(s)$ then provides a spatial profile of chain conformation.

A related log-slope $\alpha(s) = \frac{d \ln \langle R^2(s) \rangle}{d \ln s}$ detects sub-diffusivity and multiscaling. Analytical and numerical results demonstrate:

In the "frozen-disorder" (static loop) regime: $\alpha(s)$ has a single minimum at $s \sim \lambda$ (mean loop size).
Under dynamic extrusion (finite $k_{in}$, $k_{off}$): $\alpha(s)$ shows nested minima and maxima, indicating multi-scale polymer compaction.
$\text{DMPD}(s)$ profiles are markedly non-monotonic, with richer structure than equilibrium models.
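When $\langle R^2(s) \rangle$ is only available on a grid of separations, the log-slope $\alpha(s)$ can be approximated by finite differences in log-log coordinates. The sketch below is one simple way to do this; the name `log_slope` and the choice of central differences (one-sided at the ends) are assumptions for illustration.

```python
import math
from typing import List, Sequence


def log_slope(s_values: Sequence[float], r2_values: Sequence[float]) -> List[float]:
    """Finite-difference estimate of d ln<R^2> / d ln s on a grid.

    Uses central differences in the interior and one-sided differences
    at the two ends. All inputs must be strictly positive, and s_values
    must be strictly increasing.
    """
    if len(s_values) != len(r2_values) or len(s_values) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    log_s = [math.log(s) for s in s_values]
    log_r2 = [math.log(r) for r in r2_values]
    last = len(log_s) - 1
    slopes: List[float] = []
    for i in range(len(log_s)):
        lo, hi = max(i - 1, 0), min(i + 1, last)
        slopes.append((log_r2[hi] - log_r2[lo]) / (log_s[hi] - log_s[lo]))
    return slopes


# For an ideal chain, <R^2(s)> is proportional to s, so the slope is 1.
grid = [1.0, 2.0, 4.0, 8.0, 16.0]
alpha = log_slope(grid, grid)  # every entry equals 1.0
```

Minima and maxima of the returned values can then be located by scanning for sign changes in consecutive differences, although noisy data would typically need smoothing first.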

Fitting experimental $\text{DMPD}(s)$ enables extraction of physical parameters (extrusion rates, loop sizes), thus characterizing non-equilibrium chromatin folding in vivo (Nikitin et al., 8 May 2025).

3. DMPD-Based Change-Point Detection in High Dimensions

A third formalization appears in high-dimensional change-point analysis, where the dynamic mean pairwise distance serves as a test statistic for detecting abrupt distributional shifts in sequences $(Z_1, ..., Z_n)$, $Z_i \in \mathbb{R}^d$ (Ghoshal et al., 13 Nov 2025). For each candidate change-point $t$:

Compute within-left $T_{11}(t)$, within-right $T_{22}(t)$, and cross-block $T_{12}(t)$ mean pairwise distances:

$T_{11}(t) = \binom{t}{2}^{-1} \sum_{1 \leq i < j \leq t} \|Z_i - Z_j\|, \quad T_{22}(t) = \binom{n-t}{2}^{-1} \sum_{t+1 \leq i < j \leq n} \|Z_i - Z_j\|$

$T_{12}(t) = \frac{1}{t(n-t)} \sum_{i=1}^t \sum_{j=t+1}^n \|Z_i - Z_j\|$

The dynamic mean pairwise distance statistic at $t$ is then:

$\mathcal{D}_n(t) = \{ T_{12}(t) - T_{11}(t) \}^2 + \{ T_{12}(t) - T_{22}(t) \}^2$

Scanning over admissible $t$, the final scan statistic is $S_n = \max_{t} w(t) \mathcal{D}_n(t)$ with $w(t) = t(n-t)/n^2$.
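The scan can be sketched as follows with Euclidean distances. The names `euclidean` and `dmpd_scan` are hypothetical, and the admissible range $2 \le t \le n-2$ is an assumption chosen so that both within-block averages are defined. The block sums are updated incrementally as one point moves from the right block to the left, which is an implementation choice for this example rather than a detail taken from the cited work.

```python
import math
from typing import Sequence, Tuple


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dmpd_scan(points: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Return (S_n, t_hat): the weighted scan maximum and its location."""
    n = len(points)
    if n < 4:
        raise ValueError("need at least 4 points")
    dist = [[euclidean(p, q) for q in points] for p in points]
    total = sum(dist[i][j] for i in range(n) for j in range(i + 1, n))
    left = 0.0     # sum of distances within the first t points
    right = total  # sum of distances within the remaining n - t points
    best_stat, best_t = -1.0, -1
    for t in range(1, n - 1):
        new = t - 1  # 0-based index of the point joining the left block
        left += sum(dist[i][new] for i in range(new))
        right -= sum(dist[new][j] for j in range(new + 1, n))
        if t < 2:
            continue  # T_11 is undefined for a single point
        cross = total - left - right
        t11 = left / (t * (t - 1) / 2)
        t22 = right / ((n - t) * (n - t - 1) / 2)
        t12 = cross / (t * (n - t))
        d_t = (t12 - t11) ** 2 + (t12 - t22) ** 2
        weighted = (t * (n - t) / n ** 2) * d_t
        if weighted > best_stat:
            best_stat, best_t = weighted, t
    return best_stat, best_t


# Three points at 0 followed by three at 10: the shift is after t = 3.
data = [(0.0,), (0.0,), (0.0,), (10.0,), (10.0,), (10.0,)]
stat, t_hat = dmpd_scan(data)  # stat is 50.0, t_hat is 3
```

In the example, $T_{11}(3) = T_{22}(3) = 0$ and $T_{12}(3) = 10$, so $\mathcal{D}_n(3) = 200$ and the weight $w(3) = 9/36$ gives $50$.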

Permutation tests are used for significance. The method is provably consistent in the classical ($n \to \infty$, fixed $d$), high-dimension low-sample-size (HDLSS; $d \to \infty$, fixed $n$), and joint high-dimensional ($n, d \to \infty$) settings under mild conditions, leveraging the concentration of pairwise distances in high dimensions.
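A generic permutation test for a scan-type statistic can be sketched as below. The function name `permutation_p_value`, the default of 199 permutations, and the add-one p-value convention are assumptions for illustration; the statistic is passed in as a callable, and the difference of half-sample means in the example is only a stand-in for $S_n$.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def permutation_p_value(
    sequence: Sequence[T],
    statistic: Callable[[Sequence[T]], float],
    n_permutations: int = 199,
    seed: Optional[int] = None,
) -> float:
    """Estimate a p-value by shuffling the order of the observations.

    Under the null hypothesis of no change-point the observations are
    exchangeable, so shuffled sequences give a reference distribution.
    """
    rng = random.Random(seed)
    observed = statistic(sequence)
    shuffled = list(sequence)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_permutations + 1)


def half_mean_gap(values: Sequence[float]) -> float:
    """Stand-in statistic: gap between first-half and second-half means."""
    mid = len(values) // 2
    first, second = values[:mid], values[mid:]
    return abs(sum(first) / len(first) - sum(second) / len(second))


step = [0.0] * 10 + [10.0] * 10
p_value = permutation_p_value(step, half_mean_gap, seed=0)
```

With a clear step in the data, almost no shuffled ordering matches the observed gap, so the estimated p-value is close to its minimum of $1/(\text{n\_permutations} + 1)$.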

Extensions allow for non-Euclidean distances, kernelizations, block-wise statistics, and efficient online/streaming variants. Empirical benchmarks show high sensitivity in detecting both location and scale changes, especially in high-dimensional and heavy-tailed settings (Ghoshal et al., 13 Nov 2025).

4. Methodological Comparisons and Operational Distinctions

Distinctions arise among these uses of DMPD, as each applies the mean pairwise distance principle in different mathematical and algorithmic settings:

Application Domain	DMPD Definition	Primary Purpose
LLM-Generated Code (Rajput et al., 3 Jan 2026)	Mean-DTW cost between normalized MPPs	Solution-level runtime profiling
Loop-Extruded Polymers (Nikitin et al., 8 May 2025)	RMS bead separation as function of $s$	Quantify conformational scaling
Change-Point Detection (Ghoshal et al., 13 Nov 2025)	Between/within block mean distances	Detect abrupt distributional shifts

In code profiling, DMPD measures temporal execution-shape divergence, abstracted from input amplitude, providing an operational notion of stability. In polymer physics, DMPD is a geometric measure reflecting non-equilibrium folding statistics. In change-point statistics, it encodes a dynamic divergence functional specialized for high-dimensional hypothesis testing.

5. Practical Implementations and Empirical Findings

Each domain provides field-specific guidelines for practical DMPD implementation and analysis:

For code profiling, use line-based sampling with moderate quantization and select among correct generations by DMPD-median to ensure both reliability and maintainability (Rajput et al., 3 Jan 2026).
In polymer modeling, $\text{DMPD}(s)$ and its log-slope indicate the degree of non-equilibrium looping, and dynamical parameters can be extracted from fitted curves (Nikitin et al., 8 May 2025).
For change-point analysis, precomputing distance matrices allows $O(n^2 d)$ overall complexity, with streaming-friendly updates. Permutation thresholds control false positives across dimensional regimes (Ghoshal et al., 13 Nov 2025).

The cited empirical studies report that DMPD-type statistics either outperform or complement classical amplitude-based and static-metric baselines, particularly in quantifying operational divergence, multi-scale spatial compaction, and abrupt distributional shifts.

6. Cross-Disciplinary Significance and Interpretive Cautions

Although "Dynamic Mean Pairwise Distance" is not a universal formula, it is a conceptual thread uniting shape-based, operational, and high-dimensional comparisons. Its utility arises from averaging local or temporal fluctuations to expose collective divergence or change—whether in program executions, biomolecular conformations, or statistical samples—beyond the reach of traditional pointwise or static measures.

A plausible implication is that further methodological synthesis of DMPD-style metrics could leverage their inherent invariances (to scale, alignment, or marginal projections) in a wide array of dynamical systems, robustness analyses, and high-dimensional learning tasks. However, cross-context translation requires care, as the mathematical and operational meanings of DMPD are context-sensitive and must be interpreted strictly as defined in the referring domain (Rajput et al., 3 Jan 2026, Nikitin et al., 8 May 2025, Ghoshal et al., 13 Nov 2025).