Barycentric Predictive Advantage in NPC Spaces
- Barycentric predictive advantage is a framework that generalizes exponentially weighted aggregation from Euclidean to nonpositively curved (NPC) spaces by replacing linear averages with unique barycenters (Fréchet means).
- It preserves classical regret bounds and statistical guarantees through a geometric adaptation of Jensen’s inequality, ensuring robust performance in curved settings.
- The method enables applications in areas like hyperbolic embeddings, SPD matrix spaces, and phylogenetic trees, facilitating effective online-to-batch conversion and aggregation in non-Euclidean contexts.
The barycentric predictive advantage refers to the extension of exponentially weighted aggregation from linear (vector) spaces to general nonpositively curved (NPC) geodesic metric spaces, obtained by replacing linear averages with barycenters. This generalization preserves the core regret and statistical guarantees of the classical prediction-with-expert-advice framework, enabling predictive aggregation in geometric settings such as hyperbolic spaces, spaces of symmetric positive definite (SPD) matrices, or phylogenetic tree spaces. The framework relies on the existence and uniqueness of barycenters (Fréchet means) in NPC spaces and on a geometric generalization of Jensen’s inequality, which together let the essential steps of the classical regret analysis and of online-to-batch conversion go through without additional curvature penalties. As a result, prediction and aggregation tasks previously restricted to Euclidean geometry become theoretically grounded in general NPC domains (Paris, 2020).
1. Barycenters in Nonpositively Curved Spaces
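Let $(M, d)$ denote a complete geodesic metric space satisfying Alexandrov’s curvature condition $\operatorname{curv}(M) \le 0$. For $\mu \in \mathcal{P}_2(M)$, the set of Borel probability measures on $M$ with finite second moments,
the barycenter or Fréchet mean of $\mu$ is any minimizer
$$b_\mu \in \operatorname*{arg\,min}_{x \in M} \int_M d(x, y)^2 \, \mu(dy).$$
In an NPC space, this minimizer exists and is unique. Sturm's variance inequality holds: for all $x \in M$,
$$d(x, b_\mu)^2 \le \int_M \left[ d(x, y)^2 - d(b_\mu, y)^2 \right] \mu(dy),$$
and for geodesically convex, lower semicontinuous $f : M \to \mathbb{R}$,
$$f(b_\mu) \le \int_M f(y) \, \mu(dy).$$
This geometric Jensen inequality generalizes the classical Jensen inequality for vector spaces and is essential for the regret analysis in this setting.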
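As a concrete illustration, the following Python sketch computes a weighted Fréchet mean in the hyperbolic plane (hyperboloid model), one example of an NPC space, and numerically checks Sturm's inequality $d(x, b)^2 \le \sum_i w_i [d(x, y_i)^2 - d(b, y_i)^2]$ at a random point. The choice of model, the helper names (`mink`, `lift`, `log_map`, `exp_map`, `frechet_mean`), the Riemannian gradient-descent solver, and its step size are assumptions made for illustration; they are not taken from Paris (2020).

```python
import numpy as np

def mink(u, v):
    """Minkowski form <u, v> = -u0*v0 + u1*v1 + ... along the last axis."""
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def lift(p):
    """Lift p in R^k to the hyperboloid {x : <x, x> = -1, x0 > 0}."""
    p = np.asarray(p, dtype=float)
    x0 = np.sqrt(1.0 + np.sum(p * p, axis=-1, keepdims=True))
    return np.concatenate([x0, p], axis=-1)

def dist(x, y):
    """Hyperbolic distance d(x, y) = arccosh(-<x, y>)."""
    return np.arccosh(np.maximum(-mink(x, y), 1.0))

def log_map(x, ys):
    """Tangent vectors at x pointing to each row of ys, with length d(x, y)."""
    c = np.maximum(-mink(x, ys), 1.0)              # cosh of the distance
    d = np.arccosh(c)
    s = np.sqrt(np.maximum(c * c - 1.0, 1e-300))   # sinh of the distance
    scale = np.where(d > 1e-12, d / s, 1.0)
    return scale[:, None] * (ys - c[:, None] * x)

def exp_map(x, v):
    """Follow the geodesic from x with initial velocity v for unit time."""
    n = np.sqrt(max(mink(v, v), 0.0))
    if n < 1e-15:
        return x
    y = np.cosh(n) * x + np.sinh(n) * v / n
    return y / np.sqrt(-mink(y, y))                # re-project against rounding drift

def frechet_mean(ys, w, step=0.5, iters=100):
    """Weighted Frechet mean by Riemannian gradient descent (heuristic step size)."""
    x = ys[0]
    for _ in range(iters):
        # sum_i w_i log_x(y_i) is minus half the gradient of sum_i w_i d(x, y_i)^2
        x = exp_map(x, step * np.sum(w[:, None] * log_map(x, ys), axis=0))
    return x

rng = np.random.default_rng(0)
ys = lift(rng.uniform(-0.5, 0.5, size=(6, 2)))     # six points of the hyperbolic plane
w = rng.dirichlet(np.ones(6))                      # random probability weights
b = frechet_mean(ys, w)

x = lift(rng.uniform(-0.5, 0.5, size=2))           # arbitrary test point
lhs = dist(x, b) ** 2
rhs = np.sum(w * (dist(x, ys) ** 2 - dist(b, ys) ** 2))
print("d(x,b)^2 =", lhs, " variance gap =", rhs)   # the inequality predicts lhs <= rhs
```

In a Euclidean space the two printed quantities would coincide; in negative curvature the right-hand side is typically strictly larger.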
2. Exponentially Weighted Aggregation via Barycenters
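In the prediction with expert advice framework, at each round $t = 1, \dots, T$ experts $\theta \in \Theta$ propose predictions $x_{\theta,t} \in M$. A prior $\pi$ is fixed on $\Theta$ together with a learning-rate sequence $(\eta_t)_{t \ge 1}$. Writing $\ell_s : M \to \mathbb{R}$ for the loss function revealed at round $s$, the cumulative loss of expert $\theta$ after round $t$ is $L_t(\theta) = \sum_{s=1}^{t} \ell_s(x_{\theta,s})$, with $L_0 \equiv 0$. The corresponding Gibbs measure is
$$\pi_t(d\theta) = \frac{\exp(-\eta_t L_{t-1}(\theta))}{\int_\Theta \exp(-\eta_t L_{t-1}(\vartheta)) \, \pi(d\vartheta)} \, \pi(d\theta).$$
Whereas in the Euclidean setting, the prediction is the linear average
$$\hat{x}_t = \int_\Theta x_{\theta,t} \, \pi_t(d\theta),$$
in an NPC space, $\hat{x}_t$ is the barycenter of $\pi_t$ pushed forward by $\theta \mapsto x_{\theta,t}$:
$$\hat{x}_t = \operatorname*{arg\,min}_{x \in M} \int_\Theta d(x, x_{\theta,t})^2 \, \pi_t(d\theta).$$
For finite $\Theta$, the prediction $\hat{x}_t$ is the unique minimizer of
$$x \mapsto \sum_{\theta \in \Theta} w_{\theta,t} \, d(x, x_{\theta,t})^2$$
with discrete weights $w_{\theta,t} = \pi_t(\{\theta\}) \propto \pi(\{\theta\}) \exp(-\eta_t L_{t-1}(\theta))$.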
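A single prediction round for finitely many experts can be sketched as follows: Gibbs weights are computed from the cumulative losses, and the forecast is the weighted barycenter of the expert predictions. The hyperbolic plane (hyperboloid model) stands in for the NPC space, and the function names (`gibbs_weights`, `ewa_predict`, `frechet_mean`), the gradient-descent barycenter solver, and the numerical values are hypothetical choices for illustration only.

```python
import numpy as np

def mink(u, v):
    """Minkowski form <u, v> = -u0*v0 + u1*v1 + ... along the last axis."""
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def lift(p):
    """Lift p in R^k to the hyperboloid {x : <x, x> = -1, x0 > 0}."""
    p = np.asarray(p, dtype=float)
    x0 = np.sqrt(1.0 + np.sum(p * p, axis=-1, keepdims=True))
    return np.concatenate([x0, p], axis=-1)

def dist(x, y):
    """Hyperbolic distance d(x, y) = arccosh(-<x, y>)."""
    return np.arccosh(np.maximum(-mink(x, y), 1.0))

def frechet_mean(ys, w, step=0.5, iters=100):
    """Weighted Frechet mean by Riemannian gradient descent (heuristic step size)."""
    x = ys[0]
    for _ in range(iters):
        c = np.maximum(-mink(x, ys), 1.0)
        d = np.arccosh(c)
        s = np.sqrt(np.maximum(c * c - 1.0, 1e-300))
        logs = np.where(d > 1e-12, d / s, 1.0)[:, None] * (ys - c[:, None] * x)
        v = step * np.sum(w[:, None] * logs, axis=0)   # descent direction in the tangent space
        n = np.sqrt(max(mink(v, v), 0.0))
        if n < 1e-15:
            break
        x = np.cosh(n) * x + np.sinh(n) * v / n        # exponential map
        x = x / np.sqrt(-mink(x, x))                   # re-project against rounding drift
    return x

def gibbs_weights(cum_loss, eta, prior):
    """Weights proportional to prior * exp(-eta * cumulative loss)."""
    logw = np.log(prior) - eta * (cum_loss - np.min(cum_loss))  # shift for stability
    w = np.exp(logw)
    return w / w.sum()

def ewa_predict(expert_preds, cum_loss, eta, prior):
    """Barycentric forecast: Frechet mean of expert predictions under Gibbs weights."""
    w = gibbs_weights(cum_loss, eta, prior)
    return frechet_mean(expert_preds, w), w

expert_preds = lift([[0.4, 0.0], [-0.3, 0.2], [0.0, -0.5]])   # three expert predictions
cum_loss = np.array([1.0, 0.2, 3.0])                          # losses accumulated so far
prior = np.full(3, 1.0 / 3.0)

pred, w = ewa_predict(expert_preds, cum_loss, eta=1.0, prior=prior)
print("weights:", w)
print("forecast (hyperboloid coordinates):", pred)
```

With `eta=0` the weights reduce to the prior, and as `eta` grows the forecast moves toward the prediction of the expert with the smallest cumulative loss.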
3. Regret Bounds and Geometric Jensen's Inequality
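Assume a loss of the form $\ell_t(x) = \ell(x, y_t)$ such that $x \mapsto \exp(-\eta\,\ell(x, y))$ is geodesically concave (e.g., squared-distance losses in bounded NPC spaces with sufficiently small $\eta$). The bounds below are written for a constant learning rate $\eta_t \equiv \eta$, with $L_t(\theta)$ the cumulative loss of expert $\theta$ after round $t$, $\pi_t \propto \exp(-\eta L_{t-1})\,\pi$ the Gibbs measure, and $\hat{x}_t$ the barycentric prediction. The normalizing partition function
$$W_t = \int_\Theta \exp(-\eta L_t(\theta)) \, \pi(d\theta)$$
yields, by classical Gibbs-variational arguments,
$$-\frac{1}{\eta} \log W_T = \inf_{\rho} \left\{ \int_\Theta L_T(\theta) \, \rho(d\theta) + \frac{\mathrm{KL}(\rho \,\|\, \pi)}{\eta} \right\},$$
where the infimum runs over probability measures $\rho$ on $\Theta$.
The analysis in NPC spaces invokes the geometric Jensen inequality, ensuring
$$\exp(-\eta\,\ell_t(\hat{x}_t)) \ge \int_\Theta \exp(-\eta\,\ell_t(x_{\theta,t})) \, \pi_t(d\theta) = \frac{W_t}{W_{t-1}},$$
and consequently, by telescoping with $W_0 = 1$,
$$\sum_{t=1}^{T} \ell_t(\hat{x}_t) \le -\frac{1}{\eta} \log W_T.$$
This yields a uniform regret bound:
$$\sum_{t=1}^{T} \ell_t(\hat{x}_t) \le \inf_{\rho} \left\{ \int_\Theta L_T(\theta) \, \rho(d\theta) + \frac{\mathrm{KL}(\rho \,\|\, \pi)}{\eta} \right\}.$$
For finite $\Theta$ of size $N$ and uniform prior, the regret $R_T = \sum_{t=1}^{T} \ell_t(\hat{x}_t) - \min_{\theta \in \Theta} L_T(\theta)$ is bounded by $(\log N)/\eta$. The geometric structure requires only substitution of barycenter and geometric Jensen for linear average and classical Jensen, with no further curvature-dependent terms.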
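The finite-expert bound can be checked numerically. The sketch below runs the barycentric forecaster with constant experts and squared-distance loss in the hyperbolic plane, then compares the realized regret with $(\log N)/\eta$. Everything specific here is an assumption for illustration: the hyperboloid model, the random data, the helper names, the gradient-descent barycenter solver, and the choice $\eta = 1/(2D^2)$, where $D$ bounds the diameter of a ball containing all points (a Hessian computation suggests this makes $x \mapsto \exp(-\eta\, d(x, y)^2)$ geodesically concave on that ball).

```python
import numpy as np

def mink(u, v):
    """Minkowski form <u, v> = -u0*v0 + u1*v1 + ... along the last axis."""
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def lift(p):
    """Lift p in R^k to the hyperboloid {x : <x, x> = -1, x0 > 0}."""
    p = np.asarray(p, dtype=float)
    x0 = np.sqrt(1.0 + np.sum(p * p, axis=-1, keepdims=True))
    return np.concatenate([x0, p], axis=-1)

def dist(x, y):
    """Hyperbolic distance d(x, y) = arccosh(-<x, y>)."""
    return np.arccosh(np.maximum(-mink(x, y), 1.0))

def frechet_mean(ys, w, step=0.5, iters=100):
    """Weighted Frechet mean by Riemannian gradient descent (heuristic step size)."""
    x = ys[0]
    for _ in range(iters):
        c = np.maximum(-mink(x, ys), 1.0)
        d = np.arccosh(c)
        s = np.sqrt(np.maximum(c * c - 1.0, 1e-300))
        logs = np.where(d > 1e-12, d / s, 1.0)[:, None] * (ys - c[:, None] * x)
        v = step * np.sum(w[:, None] * logs, axis=0)
        n = np.sqrt(max(mink(v, v), 0.0))
        if n < 1e-15:
            break
        x = np.cosh(n) * x + np.sinh(n) * v / n
        x = x / np.sqrt(-mink(x, x))
    return x

rng = np.random.default_rng(1)
N, T, half_width = 5, 40, 0.5
experts = lift(rng.uniform(-half_width, half_width, size=(N, 2)))   # constant experts
outcomes = lift(rng.uniform(-half_width, half_width, size=(T, 2)))  # revealed y_t

# All points lie in a geodesic ball of radius arcsinh(half_width * sqrt(2)).
D = 2.0 * np.arcsinh(half_width * np.sqrt(2.0))
eta = 1.0 / (2.0 * D ** 2)          # assumed small enough for exp-concavity

cum = np.zeros(N)                   # cumulative expert losses L_{t-1}
learner_loss = 0.0
for y in outcomes:
    w = np.exp(-eta * (cum - cum.min()))
    w /= w.sum()                    # Gibbs weights under a uniform prior
    pred = frechet_mean(experts, w) # barycentric forecast
    learner_loss += dist(pred, y) ** 2
    cum += dist(y, experts) ** 2

regret = learner_loss - cum.min()
bound = np.log(N) / eta
print("regret:", regret, " bound log(N)/eta:", bound)
```

Note that the bound does not grow with the number of rounds `T`, which is what later gives a fast rate after online-to-batch conversion.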
4. Online-to-Batch Conversion in NPC Geometry
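Let $Z_1, \dots, Z_n$ be i.i.d. observations, let $\ell(\cdot, z)$ be geodesically convex, and write $R(x) = \mathbb{E}\,\ell(x, Z)$ for the risk. Any online forecaster with regret bound
$$\sum_{t=1}^{n} \ell(\hat{x}_t, Z_t) - \inf_{\theta \in \Theta} \sum_{t=1}^{n} \ell(x_\theta, Z_t) \le B_n$$
can be converted into a batch estimator in $M$ as
$$\bar{x}_n = \operatorname*{arg\,min}_{x \in M} \frac{1}{n} \sum_{t=1}^{n} d(x, \hat{x}_t)^2,$$
where $\hat{x}_t$ is the online predictor at round $t$ based on $Z_1, \dots, Z_{t-1}$. In the Euclidean case, this reduces to the linear mean; here, it is the barycenter. The geometric Jensen inequality yields
$$\mathbb{E}\, R(\bar{x}_n) - \inf_{\theta \in \Theta} R(x_\theta) \le \frac{B_n}{n}.$$
When $B_n = O(1)$ (as in exponentially weighted aggregation), the standard rate $O(1/n)$ is obtained.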
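The chain of inequalities behind this conversion can be written out as follows. This is the standard argument adapted to barycenters, sketched under the assumptions that the observations $Z_1, \dots, Z_n$ are i.i.d., that the loss $\ell(\cdot, z)$ is geodesically convex and lower semicontinuous, and that the experts $x_\theta$ are fixed points of $M$; the symbols $R(x) = \mathbb{E}\,\ell(x, Z)$ for the risk, $\bar{x}_n$ for the barycenter of the online predictions $\hat{x}_1, \dots, \hat{x}_n$, and $B_n$ for the regret bound are notation chosen here.

```latex
\begin{align*}
\mathbb{E}\,R(\bar{x}_n)
  &\le \frac{1}{n}\sum_{t=1}^{n}\mathbb{E}\,R(\hat{x}_t)
  && \text{(geometric Jensen, $R$ geodesically convex)} \\
  &= \frac{1}{n}\,\mathbb{E}\sum_{t=1}^{n}\ell(\hat{x}_t, Z_t)
  && \text{($\hat{x}_t$ depends only on $Z_1,\dots,Z_{t-1}$)} \\
  &\le \frac{1}{n}\,\mathbb{E}\inf_{\theta}\sum_{t=1}^{n}\ell(x_\theta, Z_t) + \frac{B_n}{n}
  && \text{(regret bound)} \\
  &\le \inf_{\theta} R(x_\theta) + \frac{B_n}{n}
  && \text{($\mathbb{E}\inf \le \inf\mathbb{E}$)}.
\end{align*}
```

The only step that differs from the Euclidean proof is the first one, where the linear average and classical Jensen are replaced by the barycenter and its geometric counterpart.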
5. Applications: Aggregation and Barycenter Estimation
Two principal classes of applications exemplify the barycentric predictive advantage:
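(a) Aggregation of Non-Euclidean Predictors: For a finite family of predictors $f_1, \dots, f_N : \mathcal{X} \to M$, where $M$ is NPC, the $L^2$ space of measurable maps (with metric $d_2(f, g)^2 = \int_{\mathcal{X}} d(f(x), g(x))^2 \, P(dx)$) remains NPC. The barycentric EWA mechanism provides PAC-Bayes-type oracle inequalities in this setting, generalizing results from vector-valued to arbitrary curved-output models (e.g., hyperbolic embeddings, SPD-valued regressors, or phylogenetic-tree predictors).
(b) Barycenter Estimation: For i.i.d. $X_1, \dots, X_n$ in $M$ with common distribution $P$, the intrinsic barycenter $b^*$ of $P$ minimizes $x \mapsto \mathbb{E}\, d(x, X)^2$. Applying the online-to-batch aggregator where each expert predicts a constant $x_\theta \in M$, with $N$ experts, uniform prior, and learning rate $\eta$, the estimation error satisfies
$$\mathbb{E}\, d(\bar{x}_n, b^*)^2 \le \min_{\theta \in \Theta} \left\{ \mathbb{E}\, d(x_\theta, X)^2 - \mathbb{E}\, d(b^*, X)^2 \right\} + \frac{\log N}{\eta\, n},$$
with potentially sharper bounds via KL-penalized priors. This achieves an $O(1/n)$ rate for Fréchet mean estimation, without requiring lower-curvature bounds or covering-number assumptions.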
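Barycenter estimation by aggregation can be sketched end to end: constant experts sit on a small grid, the barycentric forecaster is run over the sample with squared-distance loss, and the batch estimate is the barycenter of the online predictions. The hyperbolic plane (hyperboloid model), the symmetric sampling distribution (whose population barycenter is the origin by symmetry), the grid of 25 candidates, the learning rate $\eta = 1/(2D^2)$, and all helper names are assumptions made for this illustration.

```python
import numpy as np

def mink(u, v):
    """Minkowski form <u, v> = -u0*v0 + u1*v1 + ... along the last axis."""
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def lift(p):
    """Lift p in R^k to the hyperboloid {x : <x, x> = -1, x0 > 0}."""
    p = np.asarray(p, dtype=float)
    x0 = np.sqrt(1.0 + np.sum(p * p, axis=-1, keepdims=True))
    return np.concatenate([x0, p], axis=-1)

def dist(x, y):
    """Hyperbolic distance d(x, y) = arccosh(-<x, y>)."""
    return np.arccosh(np.maximum(-mink(x, y), 1.0))

def frechet_mean(ys, w, step=0.5, iters=100):
    """Weighted Frechet mean by Riemannian gradient descent (heuristic step size)."""
    x = ys[0]
    for _ in range(iters):
        c = np.maximum(-mink(x, ys), 1.0)
        d = np.arccosh(c)
        s = np.sqrt(np.maximum(c * c - 1.0, 1e-300))
        logs = np.where(d > 1e-12, d / s, 1.0)[:, None] * (ys - c[:, None] * x)
        v = step * np.sum(w[:, None] * logs, axis=0)
        n = np.sqrt(max(mink(v, v), 0.0))
        if n < 1e-15:
            break
        x = np.cosh(n) * x + np.sinh(n) * v / n
        x = x / np.sqrt(-mink(x, x))
    return x

rng = np.random.default_rng(2)
n, half_width = 100, 0.5
X = lift(rng.uniform(-half_width, half_width, size=(n, 2)))    # i.i.d. sample
grid = np.linspace(-half_width, half_width, 5)
theta = lift(np.array([[a, b] for a in grid for b in grid]))   # 25 constant experts
N = len(theta)

D = 2.0 * np.arcsinh(half_width * np.sqrt(2.0))   # diameter bound for all points
eta = 1.0 / (2.0 * D ** 2)                        # assumed exp-concavity threshold

cum = np.zeros(N)
preds = []
online_loss = 0.0
for x_t in X:
    w = np.exp(-eta * (cum - cum.min()))
    w /= w.sum()
    pred = frechet_mean(theta, w)                 # online barycentric forecast
    preds.append(pred)
    online_loss += dist(pred, x_t) ** 2
    cum += dist(x_t, theta) ** 2
preds = np.array(preds)

# Online-to-batch step: barycenter of the n online predictions.
x_bar = frechet_mean(preds, np.full(n, 1.0 / n))

def emp_risk(x):
    """Empirical squared-distance risk of a candidate barycenter."""
    return np.mean(dist(x, X) ** 2)

origin = lift(np.zeros(2))                        # population barycenter by symmetry
print("distance from estimate to population barycenter:", dist(x_bar, origin))
print("empirical risk of estimate:", emp_risk(x_bar))
```

The estimate is a barycenter of barycenters, so it stays inside the geodesic ball containing the grid, and its empirical risk is no larger than the average risk of the online predictions by the geometric Jensen inequality.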
6. Scope and Limitations
The barycentric predictive advantage consists in transferring the key regret and statistical guarantees of exponentially weighted aggregation to NPC settings, by replacing linear averages with barycenters and the classical Jensen inequality with its geometric counterpart. No additional performance penalty arises, provided the loss and learning rate satisfy the geodesic concavity condition of Section 3. Under that condition, exponentially weighted aggregation applies throughout NPC domains, broadening its methodological scope beyond traditional vector spaces (Paris, 2020).