GeloVec: Geometric CNN Segmentation Model

Updated 16 February 2026

GeloVec is a convolutional neural network framework that uses higher-dimensional geometric smoothing to address boundary instabilities in segmentation.
It integrates a modified Chebyshev distance, orthogonal basis transformation, and adaptive sampling to stabilize feature extraction and preserve object boundaries.
Empirical results show mIoU gains up to 2.7% across benchmarks, demonstrating improved precision, generalization, and computational efficiency.

GeloVec is a convolutional neural network–based framework for semantic segmentation designed to address boundary instability and contextual discontinuities inherent in conventional attention-driven methods. By explicitly modeling the feature space as a higher-dimensional manifold and leveraging advanced geometric smoothing techniques, GeloVec achieves stabilized feature extraction, superior boundary preservation, and intra-class homogeneity in visual segmentation tasks. Its architecture combines a modified Chebyshev distance metric, a multispatial (orthogonal basis) transform, and adaptive sampling weights, grounded in Riemannian geometry, while maintaining computational efficiency and robust generalization across datasets (Kriuk et al., 2 May 2025).

1. High-Level Architecture and Design Objectives

GeloVec extends the U-Net–style encoder–decoder paradigm, employing a ResNet-34 backbone. The primary aims are twofold: stabilize attention maps near object boundaries and retain coherent feature representations within homogeneous regions. Conventional CNN-based segmentation often suffers from artifacts at boundaries and fails to maintain region consistency when using pixel-wise operators. GeloVec addresses both problems by casting activations into a higher-dimensional "feature manifold" and exploiting geometric relationships.

The architecture sequentially integrates four principal modules following each encoding stage:

Orthogonal Basis Transform (OBT): Projects and re-orthogonalizes local descriptors into an expanded basis, enhancing the expressivity of local feature neighborhoods.
Geometric Adaptive Sampling (GAS): Computes a learnable, Chebyshev-style distance field over the higher-dimensional feature space.
Edge Preservation Mechanism (EPM): Gates feature mixing based on geometric distances to prevent cross-boundary information bleeding.
Attention Aggregation: Modulates the standard dot-product attention mechanism using the distance field for improved spatial coherence.

This configuration is applied at four encoding scales (GeloVecLow, GeloVecMid, GeloVecHigh, and GeloVecVeryHigh), in each case before spatial down-sampling. The decoder employs transposed convolutions and refined skip connections to output a $224 \times 224$ binary mask.

2. Geometric Smoothing via Modified Chebyshev Distance

GeloVec's geometric smoothing core is a weighted Chebyshev (ℓ∞) metric in $n$-dimensional feature space. For each center pixel $p_c$ with features $F_{p_c}$ and its neighborhood $\mathcal N(p_c)$, GeloVec learns per-offset weights $W_i \in \mathbb{R}^{C'}$. The weighted Chebyshev distance for neighbor $p_i$ is defined as:

$D_{\infty}(p_c,p_i) = \max_{d=1,\dots,C'} \left| [ W_i \odot (F_{p_c} - F_{p_i}) ]_d \right|$

The aggregation operator computes:

$D_{\mathrm{norm}}(p_c) = \sigma\left( \text{Conv}_{1\times1} \left( \max_{p_i \in \mathcal N(p_c)} D_{\infty}(p_c,p_i) \right) \right) \in [0,1]$

where $\sigma$ denotes the sigmoid activation. This process yields a robust, locally adaptive distance field, which acts as a smoothness constraint in subsequent processing.
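The two formulas above can be traced by hand on a single pixel. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the `scale` and `bias` arguments are an assumed stand-in for the $1\times1$ convolution, which reduces to a scalar affine map when it has one input and one output channel.

```python
import math

def weighted_chebyshev(f_center, f_neighbor, weights):
    """D_inf: largest weighted absolute channel difference."""
    return max(abs(w * (a - b)) for w, a, b in zip(weights, f_center, f_neighbor))

def normalized_distance(f_center, neighbors, offset_weights, scale=1.0, bias=0.0):
    """D_norm: sigmoid of an affine map of the max distance over the neighborhood.

    `scale` and `bias` are assumptions standing in for the single-channel
    1x1 convolution described in the text.
    """
    d_max = max(
        weighted_chebyshev(f_center, f_n, w)
        for f_n, w in zip(neighbors, offset_weights)
    )
    return 1.0 / (1.0 + math.exp(-(scale * d_max + bias)))

# Toy example: C' = 3 channels, two neighbors, uniform weights.
f_c = [0.2, 0.5, 0.1]
neighbors = [[0.2, 0.4, 0.1], [0.9, 0.5, 0.0]]
weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

d_first = weighted_chebyshev(f_c, neighbors[0], weights[0])   # small: similar neighbor
d_second = weighted_chebyshev(f_c, neighbors[1], weights[1])  # large: likely boundary
d_norm = normalized_distance(f_c, neighbors, weights)
```

In this toy case the second neighbor differs strongly in one channel, so it alone determines the neighborhood maximum, which is the behavior the ℓ∞ formulation is meant to capture.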

3. Adaptive Sampling Weights and Multispatial Transformation

The sampling weights $W_i$ are initialized uniformly and optimized end-to-end. Each $W_i$ scales the feature channels corresponding to the $i$-th spatial offset, and the ℓ∞ norm extracts the maximal, and thus most salient, channel difference, which highlights spatial boundaries.

Prior to distance computation, OBT projects features into an $n$-fold expanded, orthonormal basis using a $1\times1$ convolution and channel-wise $\ell_2$ normalization:

$X \rightarrow B_\mathrm{proj} \in \mathbb{R}^{B \times (nC') \times H \times W}$

$B_\mathrm{ortho} = \frac{\hat B}{\|\hat B\|_{2,\;\mathrm{channel}}}$

where $\hat B$ is $B_\mathrm{proj}$ reshaped to $B \times n \times C' \times H \times W$ and the norm is taken along the channel axis. The normalization gives each descriptor unit length, while orthogonality comes from the projection itself: in matrix notation, the $1\times1$ projection matrix $M \in \mathbb{R}^{nC' \times C}$ is constructed with orthonormal rows. These orthogonalized vectors yield an expressive, locally discriminative tensor basis for geometric computations.
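As an illustration of a projection matrix with orthonormal rows followed by channel-wise $\ell_2$ normalization, the sketch below uses modified Gram-Schmidt on a tiny matrix. It is a sketch under stated assumptions: the source does not say how $M$ is constructed, the helper names are hypothetical, and the toy sizes satisfy $nC' \le C$ because a matrix can only have orthonormal rows when it has at most as many rows as columns.

```python
import math

def orthonormalize_rows(rows):
    """Modified Gram-Schmidt; assumes the rows are linearly independent."""
    basis = []
    for row in rows:
        v = list(row)
        for q in basis:
            coeff = sum(a * b for a, b in zip(v, q))
            v = [a - coeff * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    return basis

def project(matrix, x):
    """Apply the 1x1 projection to one pixel's channel vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]

def l2_normalize(v, eps=1e-12):
    """Channel-wise L2 normalization of one descriptor."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / (norm + eps) for a in v]

# Toy sizes (assumed): C = 4 input channels, nC' = 2 output channels.
M = orthonormalize_rows([[1.0, 1.0, 0.0, 0.0],
                         [1.0, 0.0, 1.0, 0.0]])
x = [1.0, 2.0, 3.0, 4.0]          # features at one spatial position
b_proj = project(M, x)
b_ortho = l2_normalize(b_proj)

# Gram matrix of the rows; close to the identity when rows are orthonormal.
gram = [[sum(a * b for a, b in zip(r, s)) for s in M] for r in M]
```

In a trained network the orthonormality of $M$ would have to be maintained during optimization, for example by re-orthogonalizing or by a penalty term; the source does not specify which, if either, is used.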

4. Riemannian Geometry Foundation

The theoretical underpinning of GeloVec incorporates Riemannian geometry, wherein the feature space manifold $(\mathcal{M}, g)$ possesses a metric $g$ that governs intrinsic distances. Explicit curvature or geodesic equations are not derived; instead, the Chebyshev-based adaptive distance field $D_{\mathrm{norm}}$ is presented as an approximation to local geodesic computations:

$d_g(x, y) = \inf_{\gamma: [0,1] \to \mathcal{M},\; \gamma(0) = x,\; \gamma(1) = y} \int_0^1 \sqrt{g_{\gamma(t)}(\dot \gamma(t), \dot \gamma(t))}\,dt$

By focusing on the steepest channel-wise difference, the system approximates maximal local metric change, which theoretically stabilizes feature propagation and maintains segmentation fidelity under perturbations. This geometric smoothing is justified by analogy with stability results for the Laplace–Beltrami operator.
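One way to make the link between the Chebyshev distance and a geodesic length concrete is the following worked bound. It rests on an assumption that is not stated in the source: that the metric is diagonal and constant over the neighborhood, $g = \mathrm{diag}(w_1^2, \dots, w_{C'}^2)$ with $w = W_i$. Writing $\Delta = F_{p_c} - F_{p_i}$, the geodesic distance under such a metric is the weighted Euclidean norm:

$d_g(p_c, p_i) = \sqrt{\sum_{d=1}^{C'} w_d^2 \Delta_d^2} = \| W_i \odot \Delta \|_2$

The standard comparison between the ℓ∞ and ℓ2 norms then brackets it with the weighted Chebyshev distance:

$D_{\infty}(p_c,p_i) \le d_g(p_c, p_i) \le \sqrt{C'}\, D_{\infty}(p_c,p_i)$

Under this assumption, $D_{\infty}$ agrees with the local geodesic distance up to a factor that depends only on the channel count $C'$, and the two coincide when the difference is concentrated in a single channel.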

5. Parallel Implementation of Geodesic Transformations

Although not accompanied by source code, GeloVec’s processing is structured for efficient parallelization, leveraging standard GPU kernels for convolutions and reductions. The workflow involves the following stages:

Input: X ∈ ℝ^{B×C×H×W}
1. B_proj ← Conv1×1(X)                        # B×(nC')×H×W
2. B_ortho ← reshape(B_proj,B,n,C',H,W)
   B_ortho ← B_ortho / ‖B_ortho‖₂(channel)
3. For each spatial position p (in parallel):
     Gather neighbors {p_i} (dilated sample)
     For each offset i (in parallel):
         Δ_i ← W_i ⊙ (B_ortho[:,i,:,p] − B_ortho[:,i,:,p_i])
         D_i ← max_channel(|Δ_i|)
     D_max(p) ← max_i D_i
     D_norm(p) ← sigmoid(Conv1×1(D_max(p)))
4. F_edge ← Conv3×3(B_ortho)                  # edge features
   G_edge ← sigmoid(Conv1×1(D_norm))
   Y_edge ← B_ortho*(1−G_edge) + F_edge*G_edge
5. Q,K,V ← Conv1×1(Y_edge) triplet
   A_raw ← softmax((Q·K^T)/(√d_k − λ·D_norm))
   Y_out ← A_raw · V
Output: Y_out

All steps are parallelized across batch, channel, and spatial dimensions, and exploit standard convolution, dilation, and matrix multiplication kernels.
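Steps 4 and 5 of the listing can be sketched for a single spatial position in plain Python. This is an illustrative reading of the pseudocode, not released code: the function names are hypothetical, `gate_scale` and `gate_bias` are an assumed stand-in for the single-channel $1\times1$ convolution, the edge features are passed in rather than computed by a $3\times3$ convolution, and the attention denominator is taken exactly as written, $\sqrt{d_k} - \lambda \cdot D_{\mathrm{norm}}$.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def edge_gate(b_ortho, f_edge, d_norm, gate_scale=1.0, gate_bias=0.0):
    """Step 4: blend base and edge features with a gate derived from D_norm."""
    g = sigmoid(gate_scale * d_norm + gate_bias)  # assumed stand-in for Conv1x1
    return [b * (1.0 - g) + f * g for b, f in zip(b_ortho, f_edge)]

def softmax(scores):
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def modulated_attention(q, keys, values, d_norm, lam):
    """Step 5 for one query: scores are divided by sqrt(d_k) - lam * D_norm."""
    denom = math.sqrt(len(q)) - lam * d_norm
    if denom <= 0:
        raise ValueError("lam * d_norm must stay below sqrt(d_k)")
    scores = [sum(a * b for a, b in zip(q, k)) / denom for k in keys]
    weights = softmax(scores)
    out = [sum(w * v[j] for w, v in zip(weights, values))
           for j in range(len(values[0]))]
    return out, weights

# Toy inputs: d_k = 4, two key/value pairs.
blended = edge_gate([1.0, 2.0], [3.0, 4.0], d_norm=0.0)
q = [1.0, 0.0, 0.0, 0.0]
keys = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
_, w_std = modulated_attention(q, keys, values, d_norm=0.9, lam=0.0)
_, w_mod = modulated_attention(q, keys, values, d_norm=0.9, lam=1.0)
```

With $\lambda = 0$ the function reduces to standard scaled dot-product attention. With $\lambda > 0$ the denominator shrinks as $D_{\mathrm{norm}}$ grows, so attention becomes sharper near boundaries. Because $D_{\mathrm{norm}}$ lies between 0 and 1, keeping $0 \le \lambda < \sqrt{d_k}$ is enough to keep the denominator positive.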

6. Experimental Results and Comparative Evaluation

GeloVec has been validated on three benchmark datasets:

Caltech CUB-200-2011 (CUB-200)
Large-Scale Dataset for Segmentation and Classification (LSDSC)
Flood Semantic Segmentation Dataset (FSSD)

Metrics included mean Intersection over Union (mIoU), F1 score, Precision, and Recall. GeloVec demonstrated mIoU increases of:

+2.1% on CUB-200
+2.7% on LSDSC
+2.4% on FSSD

These gains were measured against U-Net, DeepLabV3+, HRNet, and SegFormer (MiT-B1). Precision increased by up to 4–5 points, underscoring improved boundary detection and region coherence.
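For reference, the reported metrics can be computed from a pair of binary masks as follows. This is a minimal sketch with hypothetical function names. It assumes flattened 0/1 masks and takes mIoU as the mean of the foreground and background IoU, since the source does not state which averaging convention was used.

```python
def confusion_counts(pred, target):
    """Count true/false positives and negatives for flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    return tp, fp, fn, tn

def binary_metrics(pred, target):
    """Return mIoU (assumed: mean over the two classes), precision, recall, F1."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    iou_fg = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    iou_bg = tn / (tn + fp + fn) if (tn + fp + fn) else 1.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"miou": (iou_fg + iou_bg) / 2,
            "precision": precision, "recall": recall, "f1": f1}

# Toy masks with 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
metrics = binary_metrics(pred, target)
```

On a real benchmark these counts would normally be accumulated over the whole evaluation set before dividing, rather than averaged per image.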

7. Computational Efficiency, Generalization, and Implementation Considerations

GeloVec maintains efficiency by retaining the ResNet-34 encoder and utilizing GPU-native operations: $1\times 1$ convolutions, ℓ∞-norms, and max-over-neighbor reductions. All geometric modules are fusable with standard deep learning kernels. The geometry-aware smoothing mechanism is dataset-agnostic, eliminating the need for hand-tuned edge losses and enabling robust transfer across domains such as bird contour detection and flood boundary mapping.

Practical recommendations include:

Utilizing grouped $1\times1$ convolutions in the OBT to control parameter count.
Precomputing dilated neighbor indices to avoid runtime overhead.
Tuning the $\lambda$ parameter in the attention softmax denominator to interpolate between standard and geometry-modulated attention.
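The second recommendation, precomputing dilated neighbor indices, might look like the sketch below. The function name, the $3\times3$ neighborhood shape, the dilation value, and the border handling by clamping are all assumptions, because the source does not specify them.

```python
def dilated_neighbor_indices(height, width, dilation=2):
    """Map each flat pixel index to the flat indices of its 8 dilated neighbors.

    Out-of-range coordinates are clamped to the border (an assumption;
    zero padding or reflection would be equally plausible choices).
    """
    offsets = [(dy * dilation, dx * dilation)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dy, dx) != (0, 0)]
    table = []
    for y in range(height):
        for x in range(width):
            row = []
            for dy, dx in offsets:
                ny = min(max(y + dy, 0), height - 1)
                nx = min(max(x + dx, 0), width - 1)
                row.append(ny * width + nx)
            table.append(row)
    return table

# Build the table once for a 5x5 feature map and reuse it on every forward pass.
table = dilated_neighbor_indices(5, 5, dilation=2)
center_neighbors = table[2 * 5 + 2]   # neighbors of the pixel at (row 2, col 2)
corner_neighbors = table[0]           # neighbors of (0, 0), clamped at the border
```

Because the table depends only on the feature-map size and dilation, it can be built once per scale and then used as a gather index, which is the overhead saving the recommendation refers to.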

Overall, GeloVec exemplifies a coherent integration of higher-dimensional geometric feature smoothing, orthogonal projection, and attention gating, enabling sharper and more stable segmentation masks without substantial computational overhead (Kriuk et al., 2 May 2025).


References (1)

GeloVec: Higher Dimensional Geometric Smoothing for Coherent Visual Feature Extraction in Image Segmentation (2025)
