Normalized L1 Distance: Scale-Invariant Metric

Updated 18 November 2025
  • Normalized L1 distance is a scale-invariant measure defined as the expected absolute difference normalized by the sum of absolute first moments, ensuring values between 0 and 1.
  • It provides closed-form expressions for standard distributions and connects to established indices, such as the Gini index and 1-Wasserstein metric.
  • Under specific independence and nonnegativity conditions, it satisfies key metric properties, thereby robustly quantifying statistical discrepancies in diverse applications.

The normalized $L_1$-distance, denoted $D_{\rm norm}(X, Y)$, is a probabilistic metric between real-valued integrable random variables $X$ and $Y$, studied for its applications in theoretical and applied fields such as economics and physics. It is defined as the expected absolute difference between $X$ and $Y$, normalized by the sum of their absolute first moments. Because it always lies between 0 and 1, it refines the traditional $L_1$-distance into a scale-invariant measure, which matters when comparing distributions of differing magnitudes. The normalized $L_1$-distance unifies several well-established concepts, including the Gini index and the Lukaszyk–Karmovsky metric, and it fits within the framework of 1-Wasserstein optimal transport (Rolle, 2021).

1. Formal Definition and Properties

Let $(\Omega, \mathcal A, P)$ be a probability space with $X, Y \in \mathcal L_1(\Omega)$, i.e., both are integrable real-valued random variables. The (compound) $L_1$-distance is

$$D(X, Y) = \mathbb{E}|X - Y|.$$

The normalized $L_1$-distance, defined when $\mathbb{E}|X| + \mathbb{E}|Y| > 0$, is

$$D_{\rm norm}(X, Y) = \frac{\mathbb{E}|X - Y|}{\mathbb{E}|X| + \mathbb{E}|Y|},$$

and $D_{\rm norm}(X, Y) = 0$ when both expectations vanish. Since $|X - Y| \le |X| + |Y|$ pointwise, this yields $0 \le D_{\rm norm}(X, Y) \le 1$ for all such $X, Y$.
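As an illustration of the definition, the following is a minimal Python sketch of an empirical estimator. It assumes paired samples $(x_i, y_i)$ drawn from the joint law of $(X, Y)$ and replaces each expectation by a sample mean; the function name `d_norm` is hypothetical and not taken from (Rolle, 2021).

```python
def d_norm(xs, ys):
    """Empirical normalized L1 distance from paired samples (x_i, y_i).

    Assumption: expectations are replaced by sample means, and the value
    is set to 0 when both absolute first moments vanish.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_abs_diff = sum(abs(x - y) for x, y in zip(xs, ys)) / n
    denom = sum(abs(x) for x in xs) / n + sum(abs(y) for y in ys) / n
    return 0.0 if denom == 0 else mean_abs_diff / denom


xs = [1.0, 2.0, 4.0]
ys = [2.0, 2.0, 1.0]
print(d_norm(xs, ys))
# Rescaling both samples by the same factor leaves the value unchanged
# (up to floating-point rounding).
print(d_norm([10 * x for x in xs], [10 * y for y in ys]))
```

The second call illustrates scale invariance: multiplying both variables by a common positive constant rescales the numerator and the denominator equally.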

Analyzing $D_{\rm norm}$ through the axioms of metric spaces:

  • Non-negativity: $D_{\rm norm}(X, Y) \ge 0$.
  • Symmetry: $D_{\rm norm}(X, Y) = D_{\rm norm}(Y, X)$.
  • Reflexivity: $D_{\rm norm}(X, X) = 0$.
  • Identity of indiscernibles: $D_{\rm norm}(X, Y) = 0$ if and only if $X = Y$ almost surely, considering the standard identification of random variables up to almost sure equality.

In general, $D_{\rm norm}$ does not satisfy the triangle inequality. However, under the condition that $X$, $Y$, $Z$ are mutually independent, integrable, and nonnegative (with at most one of them concentrated at zero), Rolle proves that $D_{\rm norm}$ satisfies the triangle inequality:

$$D_{\rm norm}(X, Z) \le D_{\rm norm}(X, Y) + D_{\rm norm}(Y, Z).$$

This is achieved via a specific algebraic inequality involving the individual $L_1$-distances and first moments, leveraging what is termed a "Canberra-inequality" for all real $a, b, c$ (Rolle, 2021).
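For deterministic (point-mass) variables, the definition reduces to the Canberra distance $|a - b| / (|a| + |b|)$ between real numbers, so the simplest instance of the triangle inequality can be checked numerically. The sketch below is only a sanity check on a finite grid, not a proof, and the exact form of the inequality used in (Rolle, 2021) is assumed here to be the Canberra triangle inequality; the helper names `canberra` and `triangle_slack` are hypothetical.

```python
import itertools


def canberra(a, b):
    """Canberra distance between two reals, with the convention d(0, 0) = 0."""
    denom = abs(a) + abs(b)
    return 0.0 if denom == 0 else abs(a - b) / denom


def triangle_slack(a, b, c):
    """d(a, b) + d(b, c) - d(a, c); nonnegative when the triangle inequality holds."""
    return canberra(a, b) + canberra(b, c) - canberra(a, c)


# Exhaustive check on a small grid of reals in [-3, 3] (step 0.25).
grid = [k / 4 for k in range(-12, 13)]
worst = min(triangle_slack(a, b, c)
            for a, b, c in itertools.product(grid, repeat=3))
print(worst >= -1e-12)  # small tolerance for floating-point rounding
```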

2. Closed-form Expressions for Standard Distributions

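Explicit evaluation of $D_{\rm norm}$ is important in statistics and applied modeling. In the case of two independent Gaussians

$$X \sim \mathcal N(\mu_1, \sigma_1^2), \qquad Y \sim \mathcal N(\mu_2, \sigma_2^2),$$

the expected absolute difference reads

$$\mathbb{E}|X - Y| = \mu \left[ 2\Phi\!\left(\frac{\mu}{\sigma}\right) - 1 \right] + 2\sigma\, \varphi\!\left(\frac{\mu}{\sigma}\right), \qquad \mu = \mu_1 - \mu_2, \quad \sigma^2 = \sigma_1^2 + \sigma_2^2,$$

where $\Phi$ and $\varphi$ denote the cdf and pdf of the standard normal, respectively. The one-marginal expectation is

$$\mathbb{E}|X| = \mu_1 \left[ 2\Phi\!\left(\frac{\mu_1}{\sigma_1}\right) - 1 \right] + 2\sigma_1\, \varphi\!\left(\frac{\mu_1}{\sigma_1}\right).$$

$D_{\rm norm}(X, Y)$ is then computed by substituting these closed forms.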
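The Gaussian case can be sketched in a few lines of Python using the standard folded-normal mean $\mathbb{E}|Z| = \mu\,[2\Phi(\mu/\sigma) - 1] + 2\sigma\,\varphi(\mu/\sigma)$ for $Z \sim \mathcal N(\mu, \sigma^2)$, together with the fact that the difference of independent Gaussians is Gaussian. The function names are hypothetical, and the handling of $\sigma = 0$ as a point mass is an assumption added for robustness.

```python
import math


def mean_abs_normal(mu, sigma):
    """E|Z| for Z ~ N(mu, sigma^2), i.e. the folded-normal mean."""
    if sigma == 0:
        return abs(mu)  # assumption: sigma = 0 is treated as a point mass
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    # 2 * Phi(z) - 1 equals erf(z / sqrt(2)).
    return mu * math.erf(z / math.sqrt(2)) + 2 * sigma * pdf


def d_norm_gaussian(mu1, sigma1, mu2, sigma2):
    """D_norm for independent X ~ N(mu1, sigma1^2), Y ~ N(mu2, sigma2^2)."""
    # X - Y ~ N(mu1 - mu2, sigma1^2 + sigma2^2) under independence.
    num = mean_abs_normal(mu1 - mu2, math.hypot(sigma1, sigma2))
    denom = mean_abs_normal(mu1, sigma1) + mean_abs_normal(mu2, sigma2)
    return 0.0 if denom == 0 else num / denom


# Two independent standard normals: the ratio works out to 1 / sqrt(2).
print(d_norm_gaussian(0.0, 1.0, 0.0, 1.0))
```

For two independent standard normals, $\mathbb{E}|X - Y| = 2/\sqrt{\pi}$ and $\mathbb{E}|X| = \mathbb{E}|Y| = \sqrt{2/\pi}$, which gives $D_{\rm norm} = 1/\sqrt{2}$.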

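For independent uniform variables $X \sim \mathcal U(a, b)$, $Y \sim \mathcal U(c, d)$, the mean absolute difference is determined through an explicit double integration:

$$\mathbb{E}|X - Y| = \frac{1}{(b - a)(d - c)} \int_a^b \int_c^d |x - y| \, dy \, dx,$$

which evaluates to polynomials in the endpoints, with separate formulas for the cases of interval separation, inclusion, and general overlap. For pure separation ($b \le c$),

$$\mathbb{E}|X - Y| = m_Y - m_X,$$

where $m_X = (a + b)/2$ and $m_Y = (c + d)/2$ are the midpoints of the respective intervals. Tables summarizing the case enumeration and formulas are presented in (Rolle, 2021).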
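The double integral for independent uniforms can be approximated numerically as a check on any closed form. The sketch below uses a plain midpoint rule rather than the case-by-case polynomials of (Rolle, 2021); the function name, the endpoint names `a, b, c, d`, and the grid size `n` are assumptions made for illustration.

```python
def mean_abs_diff_uniform(a, b, c, d, n=400):
    """Midpoint-rule estimate of E|X - Y| for independent X ~ U(a, b), Y ~ U(c, d)."""
    hx, hy = (b - a) / n, (d - c) / n
    xs = [a + (i + 0.5) * hx for i in range(n)]
    ys = [c + (j + 0.5) * hy for j in range(n)]
    # Averaging over the n * n midpoints approximates the normalized double integral.
    return sum(abs(x - y) for x in xs for y in ys) / (n * n)


# Separated intervals: the estimate matches the difference of midpoints.
separated = mean_abs_diff_uniform(0.0, 1.0, 2.0, 4.0)
print(separated, (2.0 + 4.0) / 2 - (0.0 + 1.0) / 2)

# Identical intervals [0, 1]: E|X - Y| is 1/3, and E|X| + E|Y| = 1.
same = mean_abs_diff_uniform(0.0, 1.0, 0.0, 1.0)
print(same / (0.5 + 0.5))
```

In the separated case the integrand is linear, so the midpoint rule is exact up to rounding; in the overlapping case the kink along $x = y$ leaves a small discretization error that shrinks as `n` grows.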

3. Domains of Application and Illustrative Behavior

The normalized $L_1$-distance is used in fields where scale invariance and robust discrepancy measures are essential. In economics, it appears as the Gini index (see §4). In physics, especially error analysis, $D(X, Y)$ is known as the Lukaszyk–Karmovsky metric.

Figures in (Rolle, 2021) exemplify behavior in the bivariate normal setup: as the correlation $\rho$ approaches 1, joint distributions concentrate on the diagonal, and $D_{\rm norm}(X, Y) \to 0$ (total dependence implies null normalized distance). In uniform distributions, $D_{\rm norm}$ interpolates from 0 (total overlap) to 1 (one variable identically zero and the other nondegenerate), with critical dependence on support overlap.

4. Connections to Classical Indices and Distances

The normalized $L_1$-distance not only unifies disparate applications but also recovers several established quantities:

  • Gini index: For a distribution $F$, the Gini mean difference is $\mathbb{E}|X - X'|$, where $X$ and $X'$ are independent with law $F$. The Gini index is its normalized analogue:

$$G(F) = \frac{\mathbb{E}|X - X'|}{2\,\mathbb{E}|X|} = D_{\rm norm}(X, X').$$

Thus, $D_{\rm norm}(X, X')$ is the Gini index viewed as the “autodistance” of a distribution.

  • Lukaszyk–Karmovsky metric: $D(X, Y)$, introduced in physics for uncertainty quantification, possesses reflexivity contrary to early misconceptions.
  • Optimal transport (1-Wasserstein): If $\mu, \nu$ are probability laws, the Monge–Kantorovich problem with $|x - y|$ cost leads to the 1-Wasserstein distance

$$W_1(\mu, \nu) = \inf_{\pi \in \Pi(\mu, \nu)} \int |x - y| \, d\pi(x, y) = \int_{-\infty}^{\infty} |F_\mu(t) - F_\nu(t)| \, dt,$$

where $F_\mu, F_\nu$ are the cdfs of $\mu, \nu$. For independent $X \sim \mu$, $Y \sim \nu$, $D(X, Y)$ is the cost under the trivial product coupling.
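The connections listed above can be illustrated on empirical distributions. The sketch below is a hedged example, not code from (Rolle, 2021): it computes the Gini index as the normalized autodistance of a sample, and compares the cost of the product coupling with the 1-Wasserstein distance, which for two samples of equal size with uniform weights is obtained by matching sorted values. All function names are hypothetical.

```python
def mean_abs_diff(xs, ys):
    """Cost of the product coupling: average of |x - y| over all pairs."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def gini_index(xs):
    """Gini index as the normalized autodistance of the empirical law of xs."""
    mean_abs = sum(abs(x) for x in xs) / len(xs)
    return 0.0 if mean_abs == 0 else mean_abs_diff(xs, xs) / (2 * mean_abs)


def wasserstein1_equal_size(xs, ys):
    """W1 between two empirical laws with the same number of equally weighted atoms."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


incomes = [1.0, 2.0, 3.0, 10.0]
other = [2.0, 2.0, 5.0, 7.0]
print(gini_index(incomes))
# The optimal coupling never costs more than the product coupling.
print(wasserstein1_equal_size(incomes, other) <= mean_abs_diff(incomes, other))
```

Since the product coupling is one admissible transport plan, its cost is an upper bound on $W_1$; the comparison in the last line reflects that ordering.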

5. Mathematical and Probabilistic Structure

The normalized $L_1$-distance defines a semimetric on the space of integrable random variables, becoming a full metric when restricted to independent, nonnegative variables (the conditions of §1), as established through the generalized triangle inequality. The proof involves verifying a nontrivial algebraic condition, ultimately relying on the positivity of the "Canberra-inequality" for all real $a, b, c$:

$$\frac{|a - b|}{|a| + |b|} + \frac{|b - c|}{|b| + |c|} - \frac{|a - c|}{|a| + |c|} \ge 0.$$

This semimetric structure allows for flexible deployment across disparate random variable pairs and distributions, provided integrability conditions are met.

6. Illustrative Regimes and Range

$D_{\rm norm}(X, Y)$ assumes values in $[0, 1]$, with limiting cases as follows:

  • $D_{\rm norm}(X, Y) = 0$: holds if $X = Y$ almost surely, including the degenerate case where both random variables vanish.
  • $D_{\rm norm}(X, Y) \to 0$: as the joint law of $(X, Y)$ concentrates on the diagonal (e.g., perfect dependence, high correlation).
  • $D_{\rm norm}(X, Y) = 1$: occurs when one variable is almost surely zero while the other is integrable and nondegenerate (Rolle, 2021).

This range captures scenarios of perfect equality, maximal disparity, and interpolation governed by the probabilistic and algebraic relations between the random variables’ distributions.
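These regimes can be reproduced exactly for finite joint laws. The following sketch assumes the joint distribution of $(X, Y)$ is given as a dictionary mapping pairs $(x, y)$ to probabilities; the function name `d_norm_joint` and the three example laws are hypothetical illustrations.

```python
def d_norm_joint(pmf):
    """Exact D_norm for a finite joint law given as {(x, y): probability}."""
    num = sum(p * abs(x - y) for (x, y), p in pmf.items())
    denom = sum(p * (abs(x) + abs(y)) for (x, y), p in pmf.items())
    return 0.0 if denom == 0 else num / denom


# Joint law concentrated on the diagonal: X = Y almost surely.
diagonal = {(1, 1): 0.5, (3, 3): 0.5}
# X is almost surely zero while Y is nondegenerate.
one_zero = {(0, 1): 0.5, (0, 3): 0.5}
# Independent copies of the same two-point law (an intermediate value).
independent = {(x, y): 0.25 for x in (1, 3) for y in (1, 3)}

print(d_norm_joint(diagonal))     # 0.0
print(d_norm_joint(one_zero))     # 1.0
print(d_norm_joint(independent))  # 0.25
```

The third case is the "autodistance" of the two-point law on $\{1, 3\}$, so its value coincides with that law's Gini index.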


The normalized $L_1$-distance thus provides a robust, interpretable, and mathematically grounded similarity measure unifying concepts from diverse fields, with rigorous theoretical guarantees and tractable formulae in common applied cases (Rolle, 2021).
