Geometric Median-of-Means
- Geometric median-of-means is a robust estimation technique that partitions data into blocks and aggregates estimates using the geometric median, ensuring sub-Gaussian performance even under heavy-tailed distributions.
- It extends the classical median-of-means approach to non-Euclidean settings like Banach and CAT(0) spaces with minimal moment assumptions and a high breakdown point.
- The method achieves dimension-free concentration and has broad applications in robust PCA, sparse regression, and low-rank matrix recovery, with computational methods such as Weiszfeld’s algorithm.
The geometric median-of-means (GMoM) estimator is a robust method for mean and parameter estimation in vector spaces, Banach spaces, and more generally in metric spaces with non-positive Alexandrov curvature. The approach uses block-wise aggregation and a geometric median to provide dimension-free, sub-Gaussian exponential concentration, even for heavy-tailed distributions and in infinite-dimensional or curved spaces. Its formulation generalizes the classical median-of-means (MoM) and extends robust aggregation principles to non-Euclidean settings with minimal moment assumptions.
1. Foundations and Definitions
The GMoM method fundamentally extends the classical MoM procedure from the Euclidean setting to general metric spaces, including reflexive Banach and globally non-positively curved (NPC, CAT(0)) spaces (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).
Let $(\mathbb{X}, d)$ be a complete and separable (Polish) metric space, for which the notion of a mean is given by the population Fréchet mean $\mu^* = \operatorname{arg\,min}_{y \in \mathbb{X}} \mathbb{E}\, d^2(y, X)$. For i.i.d. samples, the empirical mean often fails to exhibit sub-Gaussian concentration unless strong tail conditions are imposed. In contrast, the GMoM construction achieves robust concentration under only a second moment assumption, leveraging the geometry of the space.
For a given set of block estimators $x_1, \dots, x_k$ (such as local means on data splits), the geometric median is defined as
$$\hat m = \operatorname{arg\,min}_{y} \sum_{j=1}^{k} \|y - x_j\|.$$
Uniqueness holds in strictly convex Banach spaces unless all $x_j$ are collinear (Minsker, 2013).
2. Construction of the Geometric Median-of-Means Estimator
The procedure consists of partitioning the sample into disjoint blocks, computing a "weak" estimator (typically the mean or an M-estimator) on each block, and aggregating these estimators by their geometric median (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).
Block Construction and Aggregation
Let $X_1, \dots, X_n$ be i.i.d. observations:
- Partition the indices $\{1, \dots, n\}$ into $k$ disjoint groups $G_1, \dots, G_k$, each of size $|G_j| \approx n/k$.
- For each block $j$, compute the block estimator (e.g., in $\mathbb{R}^d$): $\hat\mu_j = \frac{1}{|G_j|} \sum_{i \in G_j} X_i$.
- Aggregate the $\hat\mu_j$ by the geometric median: $\hat\mu = \operatorname{arg\,min}_{y} \sum_{j=1}^{k} \|y - \hat\mu_j\|$.
This extends to general weak estimators in Banach or metric spaces.
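The blockwise construction above can be sketched concretely in $\mathbb{R}^d$. The following is a minimal illustration (the helper names `geometric_median` and `gmom` are our own; the aggregation step uses Weiszfeld's iteration, discussed under computational aspects below):

```python
import numpy as np

def geometric_median(points, n_iter=100, tol=1e-8):
    """Weiszfeld iteration: iteratively re-weighted means with weights 1/distance."""
    y = points.mean(axis=0)  # start from the coordinatewise mean
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(points - y, axis=1), tol)  # avoid /0
        w = 1.0 / dist
        y_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y

def gmom(X, k):
    """Geometric median-of-means: split the rows of X into k blocks,
    average each block, then aggregate the block means by their geometric median."""
    block_means = np.array([b.mean(axis=0) for b in np.array_split(X, k)])
    return geometric_median(block_means)
```

Because the aggregation depends on the block means only through their geometric median, grossly corrupting a minority of blocks moves the estimate very little.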
Comparison by Tournament
In non-Euclidean settings, a tournament metaphor is used: for a loss function $\ell$, the blockwise empirical risks $L_j(y) = \frac{1}{|G_j|} \sum_{i \in G_j} \ell(y, X_i)$ are computed. Point $z$ "defeats" point $y$ if $L_j(z) < L_j(y)$ on a majority of the blocks.
The GMoM estimator is the point whose defeating region (the minimum radius covering all of its defeaters) is smallest: $\hat\mu = \operatorname{arg\,min}_{y} r(y)$, where $r(y) = \sup_{z \in D_y} d(y, z)$ and $D_y$ is the set of points defeating $y$.
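The tournament rule can be sketched for the squared-distance loss in $\mathbb{R}^d$. This is an illustrative simplification (the helper `tournament_select` and the restriction to a finite candidate set are our own; in the papers the minimization runs over the whole space):

```python
import numpy as np

def tournament_select(X, k, candidates):
    """Pick the candidate whose set of defeaters has the smallest radius.
    z defeats y if z's blockwise empirical risk is smaller on a majority of blocks."""
    blocks = np.array_split(X, k)
    # risks[j, c] = empirical squared-distance risk of candidate c on block j
    risks = np.array([[np.mean(np.sum((b - c) ** 2, axis=1)) for c in candidates]
                      for b in blocks])
    best, best_radius = None, np.inf
    for i, y in enumerate(candidates):
        defeaters = [z for j, z in enumerate(candidates)
                     if j != i and np.sum(risks[:, j] < risks[:, i]) > k / 2]
        radius = max((np.linalg.norm(y - z) for z in defeaters), default=0.0)
        if radius < best_radius:
            best, best_radius = y, radius
    return best
```

A grossly wrong candidate is defeated by every reasonable one on every block, so its defeating radius is large and it is never selected.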
3. Statistical Guarantees: Concentration and Robustness
The GMoM estimator achieves exponential-type (sub-Gaussian) concentration tails under minimal assumptions—even for heavy-tailed or infinite-dimensional settings (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).
Main Results
| Space/class | Concentration/tail bound | Required assumptions |
|---|---|---|
| Euclidean/Banach | $\|\hat\mu_{\mathrm{MoM}} - \mu\| \lesssim \sqrt{\operatorname{tr}(\Sigma) / n}$ with probability $\geq 1 - e^{-ck}$ | Finite variance |
| NPC metric space | $d(\hat\mu, \mu^*) \lesssim \sigma \sqrt{k/n}$ with probability $\geq 1 - e^{-ck}$ | Only second moment, entropy |
| Heavy-tailed classes | Dimension-free rates governed by the effective rank $\operatorname{tr}(\Sigma)/\|\Sigma\|$ | Model-dependent constants |
| Infinite-dimensional | Rates depend on entropy exponent | Polynomial covering entropy |
The breakdown point is high: up to nearly half of the $k$ blocks can be arbitrarily corrupted without loss of control (Yun et al., 2022, Minsker, 2013).
For the geometric median itself, sub-Gaussian deviation holds even under merely a finite second moment, with
$$\Pr(\|\hat m - m\| \geq u) \leq 2 \exp\left(- \frac{C k u^2}{\operatorname{tr}(\Sigma)} \right).$$
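The contrast with the empirical mean is easy to see in a small simulation. The following sketch is purely illustrative (block count, corruption pattern, and magnitudes are arbitrary choices, not taken from the papers):

```python
import numpy as np

def geometric_median(P, n_iter=100, tol=1e-8):
    """Weiszfeld iteration for the geometric median of the rows of P."""
    y = P.mean(axis=0)
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(P - y, axis=1), tol)
        w = 1.0 / dist
        y_new = (w[:, None] * P).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            break
        y = y_new
    return y

rng = np.random.default_rng(0)
n, k, d = 1000, 20, 4
X = rng.normal(size=(n, d))          # true mean is 0
X[:100] = 1e4                        # adversarially corrupt 2 of the 20 blocks
block_means = np.array([b.mean(axis=0) for b in np.array_split(X, k)])
mean_err = np.linalg.norm(X.mean(axis=0))                 # blown up by corruption
gmom_err = np.linalg.norm(geometric_median(block_means))  # stays at the n^{-1/2} scale
print(mean_err, gmom_err)
```

The empirical mean is dragged thousands of units away, while the geometric median of block means stays within the noise level of the uncorrupted blocks.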
Geometric Inequalities and Exponentially Small Tails
Key inequalities underlying these results include the NPC curvature-based CN inequality, the quadruple (growth) inequality, and tailored variance inequalities. These induce the dimension/curvature dependencies in the entropy chaining step for concentration (Yun et al., 2022).
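For reference, the CN inequality (Bruhat–Tits), which characterizes NPC spaces, states that for any points $x, y$ with geodesic midpoint $m$ and any $z$,
$$d^2(z, m) \;\leq\; \tfrac{1}{2} d^2(z, x) + \tfrac{1}{2} d^2(z, y) - \tfrac{1}{4} d^2(x, y).$$
In a Hilbert space this holds with equality (it is the parallelogram identity), so the deficit in the inequality quantifies non-positive curvature.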
4. Extensions and Generalizations
The framework covers not only the squared metric $d^2$, but more generally power metrics $d^p$, with the notion of "fractional defeat" providing additional flexibility in defining the estimator. Under appropriate power-type CN and variance inequalities, analogous finite- and infinite-dimensional rates hold (Yun et al., 2022).
The method encompasses a wide range of loss functions and "weak" block estimators (e.g., robust regression, low-rank matrix recovery, covariance estimation), combining them via the geometric median aggregation (Minsker, 2013).
5. Algorithmic and Computational Aspects
The geometric median is typically computed via Weiszfeld's algorithm or smoothed relaxations. For $k$ points in $\mathbb{R}^d$, each Weiszfeld iteration requires $O(kd)$ operations; empirical convergence is fast. For non-smooth settings, the Charbonnier relaxation with accelerated gradient or Newton-type methods yields iterates whose objective suboptimality translates directly into bounds on the median error (Minsker et al., 2023).
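A minimal sketch of the smoothed approach minimizes the Charbonnier-type objective $f_\varepsilon(y) = \sum_j \sqrt{\|y - x_j\|^2 + \varepsilon^2}$, which is everywhere differentiable. Here the accelerated and Newton-type schemes of the paper are replaced by a plain fixed-step gradient loop; the step size, smoothing level, and iteration count are illustrative choices:

```python
import numpy as np

def smoothed_geometric_median(P, eps=1e-2, lr=0.1, n_iter=2000):
    """Gradient descent on f_eps(y) = sum_j sqrt(||y - x_j||^2 + eps^2),
    a smooth surrogate for the geometric median objective."""
    y = P.mean(axis=0)
    for _ in range(n_iter):
        diff = y - P                                      # (k, d)
        s = np.sqrt((diff ** 2).sum(axis=1) + eps ** 2)   # smoothed distances
        grad = (diff / s[:, None]).sum(axis=0)            # gradient of f_eps
        y = y - lr * grad / len(P)                        # averaged gradient step
    return y
```

As $\varepsilon \to 0$ the surrogate objective converges to the exact geometric median objective, so a small suboptimality in $f_\varepsilon$ translates into a small error in the median.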
In general metric or manifold settings, global optimization is computationally challenging, and no general polynomial-time algorithm exists outside Euclidean or Banach spaces (Yun et al., 2022). However, the block structure is naturally parallelizable, as each block estimator is computed independently.
6. Applications and Empirical Performance
The GMoM estimator has proven effective in:
- Mean estimation under heavy-tailed noise in high-dimensional or infinite-dimensional spaces.
- Robust PCA: aggregating blockwise sample covariances by geometric median yields sub-Gaussian deviation bounds in spectral norm.
- Sparse linear regression: aggregating Lasso solutions via GMoM, providing high-probability recovery rates without sub-Gaussian noise assumptions.
- Low-rank matrix recovery: blockwise nuclear norm regression, combined via Frobenius-norm geometric median, achieves exponential deviation bounds.
- Financial time series: geometric MoM for log-returns outperforms the classical mean and entrywise medians in predictive accuracy and stability (Minsker et al., 2023, Minsker, 2013).
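The robust PCA recipe above (blockwise sample covariances aggregated via a Frobenius-norm geometric median) can be sketched as follows; `robust_covariance` is our own illustrative helper, with matrices flattened to vectors so the Euclidean Weiszfeld iteration realizes the Frobenius-norm median:

```python
import numpy as np

def geometric_median(P, n_iter=100, tol=1e-8):
    """Weiszfeld iteration for the geometric median of the rows of P."""
    y = P.mean(axis=0)
    for _ in range(n_iter):
        dist = np.maximum(np.linalg.norm(P - y, axis=1), tol)
        w = 1.0 / dist
        y_new = (w[:, None] * P).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            break
        y = y_new
    return y

def robust_covariance(X, k):
    """Blockwise sample covariances aggregated by the geometric median
    under the Frobenius norm (each covariance flattened to a vector)."""
    d = X.shape[1]
    covs = np.array([np.cov(b, rowvar=False).ravel()
                     for b in np.array_split(X, k)])
    return geometric_median(covs).reshape(d, d)
```

Downstream PCA is then run on the aggregated matrix; since the Frobenius-norm median ignores a minority of wildly corrupted blocks, the leading eigenspaces inherit the robustness.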
7. Limitations and Structural Considerations
Key limitations and structural aspects include:
- The method requires only a second moment, substantially relaxing classical moment assumptions; no higher moments are needed.
- Dimension-free behavior depends on effective rank rather than ambient dimension in heavy-tailed settings (Minsker et al., 2023).
- Exponential concentration relies crucially on non-positive curvature; the CN and quadruple inequalities do not extend to spaces with positive curvature (Yun et al., 2022).
- Entropy and covering number assumptions provide the link between geometric complexity and concentration rates in infinite-dimensional settings.
- The breakdown point is explicit and high, promoting robustness to adversarial contamination at the block level.
The geometric median-of-means principle thus constitutes a robust, theoretically grounded, and flexible approach for estimation under minimal assumptions in complex metric spaces (Yun et al., 2022, Minsker et al., 2023, Minsker, 2013).