
Radial Distance Loss Function

Updated 13 January 2026
  • Radial Distance Loss Function is a loss that uses Euclidean distance measures alongside angular margins to enforce intra-class compactness and inter-class separation.
  • It integrates a radial term with angular objectives, providing an extra degree of freedom that enhances feature clustering and mitigates embedding entanglement.
  • Empirical studies demonstrate significant improvements in classification and geometric modeling tasks, especially in low-dimensional and fine-grained scenarios.

A radial distance loss function is an objective used in machine learning to enforce constraints or regularization based explicitly on Euclidean distance—termed the "radial" component—between learned feature representations and geometric targets (such as class prototypes or surfaces). In contemporary deep learning and implicit neural representation settings, radial distance losses augment traditionally angular or pointwise loss functions to improve class separability, enforce compactness, or facilitate accurate geometric modeling.

1. Mathematical Formulation in Classification: The DistArc Loss

The DistArc loss, introduced in HyperSpaceX, is a prototypical example in which the radial distance term is tightly integrated with angular objectives for supervised classification on hyperspherical feature spaces. Denote the feature embedding by $x_i$, the class proxy vectors by $\omega_{r_k}$, and the label of the $i$-th example by $y_i$. The loss is defined as [(Chiranjeev et al., 2024), Eq. 1]:

$$L_{\mathrm{DistArc}} = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{\exp\bigl(\cos(\theta_{y_i}+m) + \cos(\phi_{y_i}) - \lambda\,\delta_{y_i}\bigr)}{\exp\bigl(\cos(\theta_{y_i}+m) + \cos(\phi_{y_i}) - \lambda\,\delta_{y_i}\bigr) + \sum_{j=1,\, j\ne y_i}^{K} \exp\bigl(\cos(\theta_j) - \lambda\,\delta_j\bigr)}$$

The critical components are the penalties $\delta_{y_i}$ and $\delta_j$, the squared Euclidean ("radial") distances between $x_i$ and the scaled unit-norm class proxies $\omega_{r_{y_i}}$ and $\omega_{r_j}$:

$$\delta_{y_i} = \|\omega_{r_{y_i}} - x_i\|_2^2, \qquad \delta_j = \|\omega_{r_j} - x_i\|_2^2,$$

where $\omega_{r_k} = r_k\,\hat{\omega}_k$ with learnable class radius $r_k$ and unit class direction $\hat{\omega}_k$.

This structure couples classic angular discriminability (ArcFace-style margin, resultant alignment) with an explicit radial term weighted by $\lambda$.
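As a concrete illustration, below is a minimal PyTorch sketch of a DistArc-style objective, assuming learnable class directions and radii. The resultant-alignment term $\cos(\phi_{y_i})$ and any logit scaling used in HyperSpaceX are omitted for brevity, so this approximates rather than reproduces the published loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistArcSketch(nn.Module):
    """Simplified DistArc-style loss: angular margin + radial distance penalty.
    (Sketch only; omits the cos(phi) resultant-alignment term of the paper.)"""
    def __init__(self, num_classes: int, feat_dim: int, margin: float = 0.5, lam: float = 0.005):
        super().__init__()
        self.num_classes, self.margin, self.lam = num_classes, margin, lam
        self.directions = nn.Parameter(torch.randn(num_classes, feat_dim))  # class directions
        self.radii = nn.Parameter(torch.ones(num_classes))                  # learnable r_k

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        w_hat = F.normalize(self.directions, dim=1)        # unit directions omega_hat_k
        proxies = self.radii.unsqueeze(1) * w_hat          # omega_{r_k} = r_k * omega_hat_k
        cos_theta = F.normalize(x, dim=1) @ w_hat.t()      # (N, K) cosines to each class
        theta = torch.acos(cos_theta.clamp(-1 + 1e-7, 1 - 1e-7))
        delta = torch.cdist(x, proxies).pow(2)             # squared radial distances (N, K)
        onehot = F.one_hot(y, self.num_classes).bool()
        # Margin-augmented cosine for the true class, plain cosine otherwise;
        # every logit carries the radial penalty -lambda * delta.
        logits = torch.where(onehot, torch.cos(theta + self.margin), cos_theta) - self.lam * delta
        return F.cross_entropy(logits, y)

loss_fn = DistArcSketch(num_classes=10, feat_dim=64)
loss = loss_fn(torch.randn(32, 64), torch.randint(0, 10, (32,)))
loss.backward()
```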

2. Mechanisms and Theoretical Significance of the Radial Term

The radial distance penalty introduces an explicit mechanism to cluster embeddings at fixed radii for their classes, orthogonal to angular separation.

  • Intra-class compactness: minimizing $\delta_{y_i}$ pulls all features of class $k$ close to their class shell of radius $r_k$ in the hyperspherical feature space.
  • Inter-class separation: simultaneously, the $-\lambda\,\delta_j$ terms in the denominator of the DistArc loss push features away from all other class shells.
  • Additional degree of freedom: radial separation supplements angular mechanisms, yielding less entangled, more clearly separated clusters even when angular-only approaches saturate due to high class density.

These properties are visualized in the referenced figures (e.g., Figures 1 and 3c of Chiranjeev et al., 2024), where class clusters inhabit distinct shells and angular sectors rather than competing solely for limited angular space.

A plausible implication is that in low-dimensional or highly entangled scenarios, the radial term offers effective class disentanglement unattainable by angular-only objectives.
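A toy numeric illustration of this extra degree of freedom, with hypothetical values: two classes share the same direction but occupy different shells, so the cosine score cannot distinguish them while the radial distance can.

```python
# Toy illustration (assumed setup, not from the paper): two classes share the
# SAME direction but sit on different shells (r = 1 and r = 3).
import torch
import torch.nn.functional as F

direction = F.normalize(torch.tensor([1.0, 1.0]), dim=0)
proxy_a, proxy_b = 1.0 * direction, 3.0 * direction     # omega_{r_k} = r_k * omega_hat

x = 2.9 * direction                                     # a feature near class B's shell
cos_a = F.cosine_similarity(x, proxy_a, dim=0)          # 1.0: angular term is blind here
cos_b = F.cosine_similarity(x, proxy_b, dim=0)          # 1.0
delta_a = (x - proxy_a).pow(2).sum()                    # ~3.61: wrong shell
delta_b = (x - proxy_b).pow(2).sum()                    # ~0.01: correct shell
print(cos_a.item(), cos_b.item(), delta_a.item(), delta_b.item())
```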

3. Hyperparameterization and Implementation

Effective use of radial distance loss terms requires precise tuning of associated hyperparameters [(Chiranjeev et al., 2024), Section 4]:

  • $\lambda$ (radial penalty weight): controls the strength of radial binding and separation. Empirical best practices are dataset-dependent: $\lambda = 0.003$ for MNIST-like datasets, $\lambda = 0.005$ for CIFAR, and progressively larger values for face recognition benchmarks.
  • $r_k$ (class radius): each class occupies its own shell, with $r_k$ learnable and initialized uniformly, which maximizes the potential for shell separation.
  • Angular margin $m$ (in $\cos(\theta + m)$): retains traditional inter-class angular discriminability.

The squared Euclidean calculation for $\delta$ is straightforward to implement in auto-differentiation frameworks:

$$\delta = \|\omega_{r_k} - x_i\|_2^2$$

This simplicity makes the approach computationally efficient and largely agnostic to the choice of network architecture.
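A minimal sketch of this building block in PyTorch follows; the names (`radial_delta`) and the uniform radius initialization value are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K, D = 100, 2                                  # e.g. a CUB-like class count, 2-D embedding
directions = nn.Parameter(torch.randn(K, D))   # class directions, learned
radii = nn.Parameter(torch.full((K,), 5.0))    # r_k: uniform init (value illustrative), learnable
lam = 0.005                                    # radial weight; 0.003 (MNIST-like), 0.005 (CIFAR)

def radial_delta(x: torch.Tensor) -> torch.Tensor:
    proxies = radii.unsqueeze(1) * F.normalize(directions, dim=1)  # omega_{r_k}
    return torch.cdist(x, proxies).pow(2)      # delta_{ik} = ||omega_{r_k} - x_i||_2^2

x = torch.randn(8, D)
delta = radial_delta(x)                        # (8, K), differentiable w.r.t. radii/directions
```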

4. Empirical Results and Applications

In the HyperSpaceX framework, empirical evaluation across seven object classification and six face recognition datasets demonstrates the impact of introducing a radial loss term [(Chiranjeev et al., 2024), Tables 2–3, 8]. Notable results include:

| Dataset | Method | Accuracy (%) |
|---|---|---|
| CIFAR-10 | Angular + Radial | 96.03 |
| Tiny-ImageNet (2-D embedding) | DistArc | 20.89 |
| CUB-200 (2-D embedding) | DistArc | 48.11 |
| CUB-200 (2-D embedding) | Angular-only | ≤ 27.77 |
  • Ablation studies show incremental gains from adding $\delta$ to angular losses (e.g., 95.16% → 95.31% with $\cos\theta + \cos\phi$, up to 96.03% with all terms).
  • Radial loss yields pronounced gains in very low-dimensional embedding scenarios and on fine-grained, high-class-count tasks, supporting its role in mitigating angular entanglement.

A plausible implication is that multi-shell hyperspherical arrangements, as enabled by radial loss, become increasingly advantageous as the number of classes grows or when embedding dimension is strictly limited.

5. Radial Distance Losses in Geometric Learning and Implicit Representations

Radial distance functions are also fundamental in geometric learning. The PHASE loss for implicit neural representations (INRs) is an instance where the loss is linked to the signed distance from a surface (Lipman, 2021).

  • In PHASE, the signed distance function is recovered via the logarithmic transform $w_\epsilon(x) = -\sqrt{\epsilon}\,\operatorname{sign}(u(x)) \log(1 - |u(x)|)$, where $u$ is the phase-field network output. As $\epsilon \to 0$, $w_\epsilon(x)$ converges to the signed distance to the surface.
  • Distance-based loss components regularize $u(x)$ toward sharp, minimal-area surfaces by minimizing a double-well potential and penalizing gradient deviation.
  • The variational analysis shows that the limiting behavior enforces occupancy interpolation and minimal perimeter for the reconstructed geometry, with the distance-function property following as a direct consequence (Lipman, 2021).

This framework illustrates the flexibility of distance-based losses beyond classification, extending into unsupervised geometric modeling and surface reconstruction.
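A minimal sketch of the logarithmic transform above, assuming PyTorch and a network output $u \in (-1, 1)$; the clamp and the `eps` value are implementation conveniences, not part of the published formulation.

```python
import math
import torch

def signed_distance_from_phase(u: torch.Tensor, eps: float = 1e-2) -> torch.Tensor:
    # w_eps(x) = -sqrt(eps) * sign(u) * log(1 - |u|); as eps -> 0 this converges
    # to the signed distance to the surface (Lipman, 2021).
    u = u.clamp(-1 + 1e-7, 1 - 1e-7)     # keep the log argument strictly positive
    return -math.sqrt(eps) * torch.sign(u) * torch.log1p(-u.abs())

u = torch.tensor([-0.99, -0.5, 0.0, 0.5, 0.99])
print(signed_distance_from_phase(u))     # odd, monotone, diverging as |u| -> 1
```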

6. Comparative Analysis and Interpretational Context

Radial distance loss functions differ qualitatively from:

  • Purely angular (cosine-based, margin-based) losses, which saturate when angular separation is limited by class density.
  • Occupancy or binary losses (e.g., cross-entropy or BCE in INR applications), which may fail to sharpen interpolation or induce minimal-area bias.
  • Eikonal or SDF-based losses (enforcing $|\nabla u| = 1$), which focus on gradient consistency without explicit perimeter minimization.

A plausible implication is that hybridizing radial with angular or phase-field losses, as in DistArc or PHASE, grants a richer inductive bias, favoring both separability and structural fidelity.
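To make the contrast concrete, here is a minimal sketch of a standard eikonal penalty via autograd, assuming PyTorch; `sdf_net` is a hypothetical coordinate network, and in a hybrid objective this term would simply be summed with radial or phase-field components.

```python
import torch
import torch.nn as nn

# Hypothetical coordinate network mapping 3-D points to a scalar field u.
sdf_net = nn.Sequential(nn.Linear(3, 64), nn.Softplus(), nn.Linear(64, 1))

def eikonal_penalty(points: torch.Tensor) -> torch.Tensor:
    # Enforce |grad u| = 1: gradient consistency only, with no explicit
    # perimeter minimization (contrast with PHASE) and no radial pull
    # toward class shells (contrast with DistArc).
    points = points.requires_grad_(True)
    u = sdf_net(points)
    (grad,) = torch.autograd.grad(u.sum(), points, create_graph=True)
    return ((grad.norm(dim=1) - 1.0) ** 2).mean()

loss = eikonal_penalty(torch.randn(128, 3))
loss.backward()   # differentiable: usable as one regularizer among several
```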

7. Significance and Future Directions

Radial distance loss functions systematically address limitations of angular-only supervision by opening new axes for disentanglement, compactness, and geometric precision. The explicit radial penalty brings substantial gains in cluster purity and inter-class partitioning, especially in low-dimensional, large-scale, or fine-grained classification, and provides robust geometric regularization for implicit neural representations. As datasets and problem domains continue to scale in complexity, the principled integration of radial (distance) components, potentially in combination with angular, SDF, or phase-field energies, is likely to remain central in the design of discriminative and generative learning objectives (Chiranjeev et al., 2024; Lipman, 2021).

References

  • Chiranjeev et al. (2024). HyperSpaceX (introduces the DistArc loss).
  • Lipman, Y. (2021). Phase Transitions, Distance Functions, and Implicit Neural Representations (the PHASE loss).
