Hyperbolic Random Forests (HoroRF)

Updated 25 February 2026
  • Hyperbolic Random Forests (HoroRF) are ensemble classification techniques that use hyperbolic geometry, replacing Euclidean splits with horospheres and geodesic hyperplanes.
  • They improve modeling of hierarchical data by exploiting the exponential growth of hyperbolic space, leading to lower distortion in tasks like taxonomy and network classification.
  • Variants such as strict HoroRF, HyperRF, and Klein-wrapper offer trade-offs in accuracy, computational complexity, and scalability in hyperbolic ensemble methods.

Hyperbolic Random Forests (HoroRF) are ensemble classification algorithms that adapt the random forest paradigm to hyperbolic geometry, leveraging the exponential growth of hyperbolic space to effectively model data with hierarchical or tree-like structure. These methods replace Euclidean splits with hyperbolic decision boundaries, such as horospheres or homogeneous geodesic hyperplanes, and incorporate hyperbolic-specific optimization and impurity measures. Several principal algorithmic realizations exist, including horosphere-based forests ("HoroRF" in the strict sense), fast homogeneous-hyperplane forests (HyperRF), and Beltrami–Klein wrapper approaches enabling seamless integration with standard Euclidean decision-tree engines.

1. Hyperbolic Geometry and Motivation

Hyperbolic space, characterized by constant negative curvature, models exponentially expanding neighborhoods: the volume of a ball grows exponentially with radius, directly mirroring hierarchical structures (e.g., a tree with branching factor $b$ has $b^\ell$ nodes at depth $\ell$). When data naturally encode hierarchies, such as taxonomies, social networks, or text corpora, hyperbolic embeddings yield lower distortion than their Euclidean counterparts. Hyperbolic Random Forests thus target classification tasks where such hierarchical structure is prevalent, providing natural inductive bias and performance gains relative to classical approaches (Doorenbos et al., 2023).

2. Random Forests in Hyperbolic Space: Core Variants

Random forests rely on recursive space partitioning by tree-structured ensembles of binary splits. In hyperbolic space, three primary approaches have crystallized:

| Approach | Split Type | Model Context |
|---|---|---|
| HoroRF | Horospheres | Lorentz/Poincaré |
| HyperRF (with HyperDT) | Homogeneous geodesic hyperplanes | Hyperboloid |
| Klein-Wrapper ("Fast-HyperRF") | Axis-parallel hyperplanes in the Klein model (with midpoint corrections) | Beltrami–Klein |

  • HoroRF (Doorenbos et al., 2023): Uses horospheres (level sets of Busemann functions, or equivalently lightlike hyperplanes in the Lorentz model) as decision boundaries. Each split is fitted by large-margin optimization (HoroSVM), and the multiclass and imbalanced variants rely on hyperclass aggregation and a class-balanced loss.
  • HyperRF (Chlenski et al., 2023): Generalizes Euclidean axis-aligned splits to geodesic hyperplanes in the hyperboloid (Lorentz) model, specified by sparse normal vectors parameterized by spatial axis and hyperbolic angle.
  • Beltrami–Klein Wrapper (Fast-HyperRF) (Chlenski et al., 4 Jun 2025): Maps hyperboloid data to the Klein model via gnomonic projection, where splits are axis-parallel, and then applies Einstein midpoint corrections to thresholds to ensure hyperbolic geometric fidelity.

3. Mathematical Foundations

Hyperbolic random forests rest on several interlocking mathematical structures. In the (Lorentz) hyperboloid model, points $x \in \mathbb{R}^{D+1}$ satisfy $\langle x, x \rangle_{\mathcal{L}} = -1/K$ (so $K > 0$ and the curvature is $-K$), where the Minkowski inner product is

$$\langle x, x' \rangle_{\mathcal{L}} = -x_0 x_0' + \sum_{i=1}^D x_i x_i'.$$

Geodesic distance is

$$\delta(x, x') = \frac{1}{\sqrt{K}} \cosh^{-1}\left(-K \langle x, x' \rangle_{\mathcal{L}} \right).$$
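The two formulas above can be checked numerically with a short sketch. It uses only the Python standard library; the function names and the `lift_to_hyperboloid` helper are illustrative assumptions, not part of any cited implementation.

```python
import math

def minkowski_dot(x, y):
    """Minkowski inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lift_to_hyperboloid(spatial, K=1.0):
    """Hypothetical helper: choose x0 > 0 so that <x, x>_L = -1/K."""
    x0 = math.sqrt(1.0 / K + sum(v * v for v in spatial))
    return [x0] + list(spatial)

def geodesic_distance(x, y, K=1.0):
    """delta(x, y) = acosh(-K <x, y>_L) / sqrt(K)."""
    # Clamp at 1 so rounding error cannot leave the domain of acosh.
    arg = max(1.0, -K * minkowski_dot(x, y))
    return math.acosh(arg) / math.sqrt(K)

origin = lift_to_hyperboloid([0.0, 0.0])
p = lift_to_hyperboloid([math.sinh(1.5), 0.0])  # point at distance 1.5 from the origin
d = geodesic_distance(origin, p)
```

With $K = 1$, the point `p` has coordinates $(\cosh 1.5, \sinh 1.5, 0)$, so `d` recovers the hyperbolic distance 1.5 up to rounding.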

Hyperbolic splits are defined as follows:

  • Homogeneous geodesic hyperplanes: $\{ x : \langle x, a \rangle_{\mathcal{L}} = 0 \}$ for spacelike $a$ (the Lorentz-orthogonal complement of a timelike vector does not meet the hyperboloid). For axis $d$ and angle $\theta$, the boundary is $\sin\theta\, x_d - \cos\theta\, x_0 = 0$. This is the ambient dot product of $x$ with the sparse normal $n^{(d,\theta)} = (-\cos\theta, 0, \ldots, \sin\theta, \ldots, 0)$, whose $\sin\theta$ entry sits at position $d$, or equivalently the Minkowski product with $a = (\cos\theta, 0, \ldots, \sin\theta, \ldots, 0)$ (Chlenski et al., 2023).
  • Horospheres: $\{ x \in \mathbb{H}^n : \langle w, x \rangle_{\mathcal{L}} + b = 0 \}$ for lightlike $w$ and $b \in \mathbb{R}$; realized via Busemann functions in the Poincaré or hyperboloid models (Doorenbos et al., 2023).
  • Beltrami–Klein axis thresholds: Gnomonic projection maps each point as $\phi_K(u_0, \vec{u}) = \vec{u}/u_0$, so the homogeneous geodesic hyperplanes above become axis-parallel hyperplanes in Klein coordinates. Thresholds are corrected post hoc using the Einstein midpoint $m_K(L, R)$ so that the split is equidistant from its two neighbors in hyperbolic geometry (Chlenski et al., 4 Jun 2025).
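The Klein-model construction can be illustrated with a minimal sketch, assuming curvature $-1$ and treating each scalar threshold as a point on a coordinate axis. Both are simplifying assumptions made here, and the function names are hypothetical rather than taken from the cited code. The Einstein midpoint weights each coordinate by its Lorentz factor $\gamma = 1/\sqrt{1 - u^2}$.

```python
import math

def to_klein(x):
    """Gnomonic projection of a hyperboloid point: (u0, u) -> u / u0."""
    return [v / x[0] for v in x[1:]]

def einstein_midpoint_1d(left, right):
    """Einstein midpoint of two scalar Klein coordinates in (-1, 1).

    Assumption: each threshold is a point on a coordinate axis, so the
    Lorentz factor depends on that single coordinate only.
    """
    g_left = 1.0 / math.sqrt(1.0 - left * left)
    g_right = 1.0 / math.sqrt(1.0 - right * right)
    return (g_left * left + g_right * right) / (g_left + g_right)

# A point at hyperbolic distance 2 from the origin along axis 1.
x = [math.cosh(2.0), math.sinh(2.0), 0.0]
u = to_klein(x)                      # Klein coordinates (tanh(2), 0)
left, right = 0.0, u[0]
m = einstein_midpoint_1d(left, right)
naive = 0.5 * (left + right)         # Euclidean midpoint, for comparison
```

Here `math.atanh(m)` equals 1.0 up to rounding, half the hyperbolic distance between the two thresholds, while the Euclidean midpoint `naive` lies closer to the origin.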

4. Algorithmic Realization and Computational Aspects

Training and Inference Workflows

  • HyperRF (Homogeneous Hyperplanes):
    • For each node and spatial axis $d$, candidate splitting angles $\theta_i$ are computed from training samples as $\arctan(x_0 / x_d)$.
    • Midpoint angles $\theta_m$ use a closed-form formula rooted in hyperbolic geometry, rather than the arithmetic mean, to ensure symmetric splits (Chlenski et al., 2023).
    • Splits use the sparse Lorentz inner product as $S(x) = \mathrm{sign}(\sin\theta\, x_d - \cos\theta\, x_0)$, computable in $O(1)$.
    • Training a full tree is $O(D n 2^{d_{\max}})$ in the worst case, but early stopping on pure nodes is standard.
  • HoroRF (Horosphere Splits):
    • At each node, up to $K$ candidate splits (one-versus-rest and hyperclass groupings) are proposed; here $K$ counts candidate splits, not the curvature.
    • Each split is trained using a large-margin horosphere classifier (HoroSVM) via convex optimization.
    • The best split is selected based on information gain; class-balanced loss and “hyperclasses” promote effective splits even under class imbalance and multiclass heterogeneity (Doorenbos et al., 2023).
  • Klein-Wrapper (Fast-HyperRF):
    • Preprocessing: Lorentz data are projected to the Klein disk via $\phi_K$.
    • Off-the-shelf Euclidean tree/forest applied, treating Klein coordinates as regular Euclidean vectors.
    • Postprocessing: For each internal split, the Euclidean threshold is replaced by the Einstein midpoint to match hyperbolic geometry.
    • Inference is either via full projection and vectorized prediction or node-wise coordinate computation (Chlenski et al., 4 Jun 2025).
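The HyperRF steps above (candidate angle, split test, hyperbolic midpoint) can be sketched as follows for curvature $-1$. The midpoint is computed here through distances along axis $d$, using $\cot\theta = \tanh r$; this is an assumption chosen for readability and not the closed-form expression from Chlenski et al. (2023). All function names are hypothetical.

```python
import math

def candidate_angle(x, d):
    """Angle of the hyperplane through x: tan(theta) = x0 / xd.

    atan2 keeps theta in (pi/4, 3*pi/4), since x0 > |xd| on the hyperboloid.
    """
    return math.atan2(x[0], x[d])

def split_side(x, d, theta):
    """Sign of sin(theta)*x_d - cos(theta)*x_0 (zero is sent to -1)."""
    s = math.sin(theta) * x[d] - math.cos(theta) * x[0]
    return 1 if s > 0 else -1

def midpoint_angle(theta_a, theta_b):
    """Angle whose hyperplane is hyperbolically halfway between two others."""
    r_a = math.atanh(1.0 / math.tan(theta_a))  # distance along axis d
    r_b = math.atanh(1.0 / math.tan(theta_b))
    return math.atan2(1.0, math.tanh(0.5 * (r_a + r_b)))

# Two points on the 1-D hyperboloid at distances 0.5 and 2.0 along axis 1.
a = [math.cosh(0.5), math.sinh(0.5)]
b = [math.cosh(2.0), math.sinh(2.0)]
theta_a = candidate_angle(a, 1)
theta_b = candidate_angle(b, 1)
theta_m = midpoint_angle(theta_a, theta_b)
sides = (split_side(a, 1, theta_m), split_side(b, 1, theta_m))
```

In this example the midpoint hyperplane sits at distance 1.25 along the axis and separates the two points, and `theta_m` differs noticeably from the arithmetic mean of `theta_a` and `theta_b`.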

Complexity Analysis

| Variant | Training Complexity | Inference Complexity | Scalability Comment |
|---|---|---|---|
| HoroRF | $O(K N_i d I)$ per node | $O(\text{tree height})$ | Quadratic in $n$ due to SVMs; overhead $\sim K$ vs. Euclidean RF |
| HyperRF | $O(Dn)$ per node | $O(d_{\max})$ per tree | Linear in $n$ and $D$; Python with scikit-learn compatibility |
| Klein-Wrap | $O(t d n \log n)$ total | $O(t n h)$ | Matches Euclidean RF complexity, "thousands of times" faster |

For the Klein-wrapper, training speedups of up to $3{,}752\times$ are reported for $n = 32{,}768$ (Chlenski et al., 4 Jun 2025), and split/test agreement with ad-hoc Lorentz implementations exceeds 99%.

5. Extensions for Multi-Class and Class Imbalance

HoroRF incorporates several mechanisms beyond what standard random forests provide:

  • Hyperclass grouping: Classes are merged recursively by Einstein-midpoint similarity to form superclasses, and splits are evaluated as one-vs-rest at each aggregation level. This enables grouping of more than one base class on a side of a split, often yielding more balanced and meaningful trees.
  • Class-balanced loss: Slack terms in the large-margin loss are reweighted inversely by class frequency, which raises the influence of rare classes and improves split diversity in imbalanced settings:

$$\ell_{\mathrm{cb}} = \frac{1}{2}\mu^2 + C\sum_{i=1}^N \frac{1-\beta}{1-\beta^{n_{y_i}}} \max\left(0,\, 1 - y_i\left(\mu B_w^{-1}(x_i) - o\right)\right).$$

(Doorenbos et al., 2023).
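To make the reweighting concrete, the following sketch evaluates the loss for given inputs. It is an illustration under stated assumptions: `scores[i]` stands in for the precomputed value $B_w^{-1}(x_i)$, $n_{y_i}$ is taken as the number of samples sharing the binary label $y_i$, and the function name and default `beta` are hypothetical choices, not values from the paper.

```python
from collections import Counter

def class_balanced_hinge_loss(scores, y, mu, offset, C=1.0, beta=0.999):
    """0.5*mu^2 + C * sum_i w_i * max(0, 1 - y_i*(mu*s_i - offset)).

    w_i = (1 - beta) / (1 - beta**n_{y_i}), so samples from rarer
    classes receive larger weights.
    """
    counts = Counter(y)  # samples per label (+1 / -1)
    total = 0.0
    for s_i, y_i in zip(scores, y):
        weight = (1.0 - beta) / (1.0 - beta ** counts[y_i])
        total += weight * max(0.0, 1.0 - y_i * (mu * s_i - offset))
    return 0.5 * mu * mu + C * total

# One positive sample and three negatives; only the last violates the margin.
scores = [2.0, -2.0, -2.0, 0.0]
y = [1, -1, -1, -1]
loss = class_balanced_hinge_loss(scores, y, mu=1.0, offset=0.0, beta=0.5)
```

With `beta=0.5`, the single positive sample gets weight 1 and each negative gets weight $0.5 / (1 - 0.5^3) = 4/7$, so the loss is $0.5 + 4/7 = 15/14$.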

6. Empirical Results and Practical Guidelines

Empirical studies evaluate HoroRF, HyperRF, and Klein-wrapper forests on synthetic Gaussian mixtures in $H^{D,K}$, hierarchical microbiome data (NeuroSEED), political blog graph embeddings, and WordNet-based hierarchical classification tasks. Representative findings include:

  • Accuracy (micro-F1): HyperDT (single-tree) and HyperRF achieve best or near-best results on $>75\%$ of tasks. HoroRF outperforms both hyperbolic and Euclidean random forest variants on hard hierarchical or imbalanced tasks such as WordNet subtrees or network node classification (Doorenbos et al., 2023, Chlenski et al., 2023, Chlenski et al., 4 Jun 2025).
  • Speed/Scalability: HyperRF and Klein-wrapper forests scale linearly in $n$ and $D$; HoroRF scales quadratically in $n$ because of per-node SVM optimization.
  • Implementation: HyperRF and Klein approaches are provided via scikit-learn–compatible Python interfaces with multithreading; HoroRF requires custom manifold-optimization backends.
  • Parameter recommendations: In benchmarking, forests use $T = 12$ trees and depth $d_{\max} = 3$, with curvature $K$ matched to the embedding geometry (Chlenski et al., 2023). Hyperbolic midpoints (not arithmetic means) are crucial for split threshold accuracy.

Ablation studies confirm that naïve midpoint calculations degrade performance. For Klein-wrapper forests, midpoint corrections improve geometric correctness but affect accuracy only marginally on most benchmarks (Chlenski et al., 4 Jun 2025). XGBoost/LightGBM backends can further increase performance on hyperbolic data.

7. Limitations and Research Directions

While hyperbolic random forest constructions capture hierarchically-structured data efficiently and provide strong empirical results, several limitations and open avenues persist:

  • HoroRF per-node cost is higher than that of standard or axis-aligned Euclidean forests, because each node solves several SVM-type optimizations, one per candidate hyperclass split (Doorenbos et al., 2023).
  • Scaling HoroRF to very large datasets requires efficient warm-starts, approximate optimizations, or potential GPU-accelerated manifold solvers.
  • Current approaches focus on classification; extensions to regression, survival forests, or boosting frameworks in hyperbolic space remain active areas of research.
  • Wrapper approaches demonstrate near-perfect empirical equivalence with ad-hoc Lorentz implementations, but geometric subtleties may matter in extreme or low-sample regimes (Chlenski et al., 4 Jun 2025).

Potential future work includes incorporating alternative split surfaces such as gyroplanes, developing gradient-boosted and deep tree ensembles in hyperbolic geometry, and hybridizing with Bayesian or probabilistic splitting criteria.


Key references: (Doorenbos et al., 2023, Chlenski et al., 2023, Chlenski et al., 4 Jun 2025).
