Geometric Median-of-Means
- Geometric median-of-means is a robust estimation technique that partitions data into blocks and aggregates estimates using the geometric median, ensuring sub-Gaussian performance even under heavy-tailed distributions.
- It extends the classical median-of-means approach to non-Euclidean settings like Banach and CAT(0) spaces with minimal moment assumptions and a high breakdown point.
- The method achieves dimension-free concentration and has broad applications in robust PCA, sparse regression, and low-rank matrix recovery, with computational methods such as Weiszfeld’s algorithm.
The geometric median-of-means (GMoM) estimator is a robust method for mean and parameter estimation in vector spaces, Banach spaces, and more generally in metric spaces with non-positive Alexandrov curvature. The approach uses block-wise aggregation and a geometric median to provide dimension-free, sub-Gaussian exponential concentration, even for heavy-tailed distributions and in infinite-dimensional or curved spaces. Its formulation generalizes the classical median-of-means (MoM) and extends robust aggregation principles to non-Euclidean settings with minimal moment assumptions.
1. Foundations and Definitions
The GMoM method fundamentally extends the classical MoM procedure from the Euclidean setting to general metric spaces, including reflexive Banach and globally non-positively curved (NPC, CAT(0)) spaces (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).
Let $(\mathbb{X}, d)$ be a complete and separable (Polish) metric space, for which the notion of a mean is given by the population Fréchet mean $\mu^* = \operatorname{arg\,min}_{y \in \mathbb{X}} \mathbb{E}\, d^2(y, X)$. For i.i.d. samples, the empirical mean often fails to exhibit sub-Gaussian concentration unless strong tail conditions are imposed. In contrast, the GMoM construction achieves robust concentration under only a second moment assumption, leveraging the geometry of the space.
For a given set of block estimators $x_1, \dots, x_k$ (such as local means on data splits), the geometric median is defined as
$$\hat m = \operatorname{arg\,min}_{y} \sum_{j=1}^{k} \|y - x_j\|.$$
Uniqueness holds in strictly convex Banach spaces unless all $x_j$ are collinear (Minsker, 2013).
2. Construction of the Geometric Median-of-Means Estimator
The procedure consists of partitioning the sample into disjoint blocks, computing a "weak" estimator (typically the mean or an M-estimator) on each block, and aggregating these estimators by their geometric median (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).
Block Construction and Aggregation
Let $X_1, \dots, X_n$ be i.i.d. observations:
- Partition the indices $\{1, \dots, n\}$ into $k$ disjoint groups $G_1, \dots, G_k$, each of size $|G_j| \approx n/k$.
- For each block $j$, compute the block estimator (e.g., in $\mathbb{R}^d$): $\hat\mu_j = \frac{1}{|G_j|} \sum_{i \in G_j} X_i$.
- Aggregate the $\hat\mu_j$ by the geometric median: $\hat\mu = \operatorname{arg\,min}_{y} \sum_{j=1}^{k} \|y - \hat\mu_j\|$.
This extends to general weak estimators in Banach or metric spaces.
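The blockwise construction above can be sketched concretely in $\mathbb{R}^d$. The following is a minimal illustration (the helper names `geometric_median` and `gmom` are our own; the aggregation step uses Weiszfeld's iteration, discussed under computational aspects below):

```python
import numpy as np

def geometric_median(points, n_iter=100, tol=1e-8):
    """Weiszfeld iteration: iteratively re-weighted means with weights 1/distance."""
    y = points.mean(axis=0)  # start from the coordinatewise mean
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(points - y, axis=1), tol)  # avoid /0
        w = 1.0 / dist
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y

def gmom(X, k):
    """Geometric median-of-means: split the rows of X into k blocks,
    average each block, then aggregate the block means by their geometric median."""
    block_means = np.array([b.mean(axis=0) for b in np.array_split(X, k)])
    return geometric_median(block_means)
```

Because the aggregation depends on the block means only through their geometric median, grossly corrupting a minority of blocks moves the estimate very little.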
Comparison by Tournament
In non-Euclidean settings, a tournament metaphor is used: for a loss function $\ell$, the blockwise empirical risks $L_j(y) = \frac{1}{|G_j|} \sum_{i \in G_j} \ell(y, X_i)$ are computed. Point $z$ "defeats" point $y$ if $L_j(z) < L_j(y)$ on a majority of the blocks.
The GMoM estimator is the point whose defeating region (the minimum radius covering all of its defeaters) is smallest: $\hat\mu = \operatorname{arg\,min}_{y} r(y)$, where $r(y) = \sup_{z \in D_y} d(y, z)$ and $D_y$ is the set of points defeating $y$.
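The tournament rule can be sketched for the squared-distance loss in $\mathbb{R}^d$. This is an illustrative simplification (the helper `tournament_select` and the restriction to a finite candidate set are our own; in the papers the minimization runs over the whole space):

```python
import numpy as np

def tournament_select(X, k, candidates):
    """Pick the candidate whose set of defeaters has the smallest radius.
    z defeats y if z's blockwise empirical risk is smaller on a majority of blocks."""
    blocks = np.array_split(X, k)
    # risks[j, c] = empirical squared-distance risk of candidate c on block j
    risks = np.array([[np.mean(np.sum((b - c) ** 2, axis=1)) for c in candidates]
                      for b in blocks])
    best, best_radius = None, np.inf
    for i, y in enumerate(candidates):
        defeaters = [z for j, z in enumerate(candidates)
                     if j != i and np.sum(risks[:, j] < risks[:, i]) > k / 2]
        radius = max((np.linalg.norm(y - z) for z in defeaters), default=0.0)
        if radius < best_radius:
            best, best_radius = y, radius
    return best
```

A grossly wrong candidate is defeated by every reasonable one on every block, so its defeating radius is large and it is never selected.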
3. Statistical Guarantees: Concentration and Robustness
The GMoM estimator achieves exponential-type (sub-Gaussian) concentration tails under minimal assumptions—even for heavy-tailed or infinite-dimensional settings (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).
Main Results
| Space/class | Concentration/tail bound | Required assumptions |
|---|---|---|
| Euclidean/Banach | $\|\hat\mu_{\mathrm{MoM}} - \mu\| \lesssim \sqrt{\operatorname{tr}(\Sigma) / n}$ with probability $\geq 1 - e^{-ck}$ | Finite variance |
| NPC metric space | $d(\hat\mu, \mu^*) \lesssim \sigma \sqrt{k/n}$ with probability $\geq 1 - e^{-ck}$ | Only second moment, entropy |
| Heavy-tailed classes | Dimension-free rates governed by the effective rank $\operatorname{tr}(\Sigma)/\|\Sigma\|$ | Model-dependent constants |
| Infinite-dimensional | Rates depend on entropy exponent | Polynomial covering entropy |
The breakdown point is high: up to nearly half of the $k$ blocks can be arbitrarily corrupted without loss of control (Yun et al., 2022, Minsker, 2013).
For the geometric median itself, sub-Gaussian deviation holds even under merely a finite second moment, with
$$\Pr(\|\hat m - m\| \geq u) \leq 2 \exp\left(- \frac{C k u^2}{\operatorname{tr}(\Sigma)} \right).$$
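The contrast with the empirical mean is easy to see in a small simulation. The following sketch is purely illustrative (block count, corruption pattern, and magnitudes are arbitrary choices, not taken from the papers):

```python
import numpy as np

def geometric_median(P, n_iter=100, tol=1e-8):
    """Weiszfeld iteration for the geometric median of the rows of P."""
    y = P.mean(axis=0)
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(P - y, axis=1), tol)
        w = 1.0 / dist
        y_new = (w[:, None] * P).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            break
        y = y_new
    return y

rng = np.random.default_rng(0)
n, k, d = 1000, 20, 4
X = rng.normal(size=(n, d))          # true mean is 0
X[:100] = 1e4                        # adversarially corrupt 2 of the 20 blocks
block_means = np.array([b.mean(axis=0) for b in np.array_split(X, k)])
mean_err = np.linalg.norm(X.mean(axis=0))                 # blown up by corruption
gmom_err = np.linalg.norm(geometric_median(block_means))  # stays at the n^{-1/2} scale
print(mean_err, gmom_err)
```

The empirical mean is dragged thousands of units away, while the geometric median of block means stays within the noise level of the uncorrupted blocks.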
Geometric Inequalities and Exponentially Small Tails
Key inequalities underlying these results include the NPC curvature-based CN inequality, the quadruple (growth) inequality, and tailored variance inequalities. These induce the dimension/curvature dependencies in the entropy chaining step for concentration (Yun et al., 2022).
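For reference, the CN inequality (Bruhat–Tits), which characterizes NPC spaces, states that for any points $x, y$ with geodesic midpoint $m$ and any $z$,
$$d^2(z, m) \;\leq\; \tfrac{1}{2} d^2(z, x) + \tfrac{1}{2} d^2(z, y) - \tfrac{1}{4} d^2(x, y).$$
In a Hilbert space this holds with equality (it is the parallelogram identity), so the deficit in the inequality quantifies non-positive curvature.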
4. Extensions and Generalizations
The framework covers not only the squared metric $d^2$, but more generally power metrics $d^p$, with the notion of "fractional defeat" providing additional flexibility in defining the estimator. Under appropriate power-type CN and variance inequalities, analogous finite- and infinite-dimensional rates hold (Yun et al., 2022).
The method encompasses a wide range of loss functions and "weak" block estimators (e.g., robust regression, low-rank matrix recovery, covariance estimation), combining them via the geometric median aggregation (Minsker, 2013).
5. Algorithmic and Computational Aspects
The geometric median is typically computed via Weiszfeld's algorithm or smoothed relaxations. For $k$ points in $\mathbb{R}^d$, each Weiszfeld iteration requires $O(kd)$ operations; empirical convergence is fast. For non-smooth settings, the Charbonnier relaxation with accelerated gradient or Newton-type methods yields iterates whose objective suboptimality translates directly into bounds on the median error (Minsker et al., 2023).
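A minimal sketch of the smoothed approach minimizes the Charbonnier-type objective $f_\varepsilon(y) = \sum_j \sqrt{\|y - x_j\|^2 + \varepsilon^2}$, which is everywhere differentiable. Here the accelerated and Newton-type schemes of the paper are replaced by a plain fixed-step gradient loop; the step size, smoothing level, and iteration count are illustrative choices:

```python
import numpy as np

def smoothed_geometric_median(P, eps=1e-2, lr=0.1, n_iter=2000):
    """Gradient descent on f_eps(y) = sum_j sqrt(||y - x_j||^2 + eps^2),
    a smooth surrogate for the geometric median objective."""
    y = P.mean(axis=0)
    for _ in range(n_iter):
        diff = y - P                                      # (k, d)
        s = np.sqrt((diff ** 2).sum(axis=1) + eps ** 2)   # smoothed distances
        grad = (diff / s[:, None]).sum(axis=0)            # gradient of f_eps
        y = y - lr * grad / len(P)                        # averaged gradient step
    return y
```

As $\varepsilon \to 0$ the surrogate objective converges to the exact geometric median objective, so a small suboptimality in $f_\varepsilon$ translates into a small error in the median.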
In general metric or manifold settings, global optimization is computationally challenging, and no general polynomial-time algorithm exists outside Euclidean or Banach spaces (Yun et al., 2022). However, the block structure is naturally parallelizable, as each block estimator is computed independently.
6. Applications and Empirical Performance
The GMoM estimator has proven effective in:
- Mean estimation under heavy-tailed noise in high-dimensional or infinite-dimensional spaces.
- Robust PCA: aggregating blockwise sample covariances by geometric median yields sub-Gaussian deviation bounds in spectral norm.
- Sparse linear regression: aggregating Lasso solutions via GMoM, providing high-probability recovery rates without sub-Gaussian noise assumptions.
- Low-rank matrix recovery: blockwise nuclear norm regression, combined via Frobenius-norm geometric median, achieves exponential deviation bounds.
- Financial time series: geometric MoM for log-returns outperforms the classical mean and entrywise medians in predictive accuracy and stability (Minsker et al., 2023, Minsker, 2013).
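The robust PCA recipe above (blockwise sample covariances aggregated via a Frobenius-norm geometric median) can be sketched as follows; `robust_covariance` is our own illustrative helper, with matrices flattened to vectors so the Euclidean Weiszfeld iteration realizes the Frobenius-norm median:

```python
import numpy as np

def geometric_median(P, n_iter=100, tol=1e-8):
    """Weiszfeld iteration for the geometric median of the rows of P."""
    y = P.mean(axis=0)
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(P - y, axis=1), tol)
        w = 1.0 / dist
        y_new = (w[:, None] * P).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            break
        y = y_new
    return y

def robust_covariance(X, k):
    """Blockwise sample covariances aggregated by the geometric median
    under the Frobenius norm (each covariance flattened to a vector)."""
    d = X.shape[1]
    covs = np.array([np.cov(b, rowvar=False).ravel()
                     for b in np.array_split(X, k)])
    return geometric_median(covs).reshape(d, d)
```

Downstream PCA is then run on the aggregated matrix; since the Frobenius-norm median ignores a minority of wildly corrupted blocks, the leading eigenspaces inherit the robustness.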
7. Limitations and Structural Considerations
Key limitations and structural aspects include:
- The method requires only a second moment, substantially relaxing classical moment assumptions; no higher moments are needed.
- Dimension-free behavior depends on effective rank rather than ambient dimension in heavy-tailed settings (Minsker et al., 2023).
- Exponential concentration relies crucially on non-positive curvature; the CN and quadruple inequalities do not extend to spaces with positive curvature (Yun et al., 2022).
- Entropy and covering number assumptions provide the link between geometric complexity and concentration rates in infinite-dimensional settings.
- The breakdown point is explicit and high, promoting robustness to adversarial contamination at the block level.
The geometric median-of-means principle thus constitutes a robust, theoretically grounded, and flexible approach for estimation under minimal assumptions in complex metric spaces (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).