SnailBot Relative Localization System
- The paper presents a decentralized sensor fusion framework that integrates UWB, monocular vision, and odometry for accurate relative localization in SnailBot robots.
- It leverages a GNN front-end with a differentiable Sinkhorn operator and a pose graph optimization back-end to achieve centimeter–decimeter RMSE accuracy in simulations and real-world tests.
- The system demonstrates high robustness against occlusions and sensor noise and scales efficiently for multi-robot swarm deployments.
A relative localization system for SnailBot refers to the integrated sensor fusion, estimation, and optimization architecture that enables each SnailBot module to perceive its position and orientation with respect to other SnailBots in its local environment, independent of global references such as GPS. Such systems are fundamental for modular, collaborative, or swarm robot deployments, where real-time awareness of neighboring relative configurations is essential for coordinated behaviors.
1. Multi-Sensor Architecture and Functional Pipeline
The state-of-the-art SnailBot relative localization framework realizes an end-to-end decentralized architecture combining UWB ranging, monocular vision, and proprioceptive odometry. Each SnailBot is equipped with:
- UWB two-way ranging module (e.g., a DW1000-based transceiver such as Nooploop's LinkTrack) providing omnidirectional inter-robot range measurements.
- Forward-facing monocular camera (fisheye, 185° FOV) producing 2D pixel detections of neighboring robots; mapping to unit-bearing vectors via camera calibration.
- Proprioceptive odometry from wheel encoders or inertial dead-reckoning, delivering a local 3-DoF pose prior.
Data fusion is staged in two components:
- Graph Match Network (GNN) front-end: Performs soft, uncertainty-aware matching between UWB-derived distances and visually detected bearings, yielding jointly optimal 3-DoF relative position hypotheses, each accompanied by an explicit per-pair measurement covariance and a prior covariance. Matches are produced via a differentiable Sinkhorn operator on a similarity matrix constructed from learned latent features, which makes the process robust to spurious and outlier detections.
- Differentiable Pose Graph Optimization (PGO) back-end: Constructs a variable-dimension graph whose nodes are relative SE(3) robot poses and whose edges encode mutual observations (vision + UWB), odometry priors, and pure UWB range constraints. The back-end jointly minimizes a cost of the standard weighted least-squares form

$$\min_{\{T_i\}} \; \sum_{(i,j)\in\mathcal{E}} r_{ij}(T_i, T_j)^\top \, \Sigma_{ij}^{-1} \, r_{ij}(T_i, T_j),$$

with nonlinear residuals $r_{ij}$ defined on 6-DoF pose variables and uncertainty weights $\Sigma_{ij}$ supplied by the front-end (Wang et al., 11 Dec 2025).
Each SnailBot runs this pipeline locally, interleaving high-frequency proprioceptive and lower-frequency peer-pose messages to maintain scalable, O(|E|)-complexity per-iteration communication.
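The per-robot interleaving of high-rate odometry with lower-rate peer messages can be sketched in a few lines. The structure below is illustrative only (the class and message names are hypothetical, not the paper's API): each node dead-reckons its own pose at high frequency and buffers the latest range/bearing observation per neighbor for the front-end to consume.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PeerMsg:
    """Hypothetical peer observation: one UWB range plus an optional bearing."""
    sender: int
    range_m: float                     # UWB two-way range to this peer
    bearing: Optional[tuple] = None    # unit bearing from vision; None if occluded

@dataclass
class RobotNode:
    """Per-robot fusion loop: high-rate odometry, low-rate peer updates."""
    pose: tuple = (0.0, 0.0, 0.0)               # (x, y, yaw) in local odometry frame
    neighbors: dict = field(default_factory=dict)

    def on_odometry(self, dx: float, dy: float, dyaw: float) -> None:
        # High-frequency proprioceptive update (dead-reckoning integration
        # of body-frame increments into the local frame).
        x, y, th = self.pose
        self.pose = (x + dx * math.cos(th) - dy * math.sin(th),
                     y + dx * math.sin(th) + dy * math.cos(th),
                     th + dyaw)

    def on_peer_message(self, msg: PeerMsg) -> None:
        # Lower-frequency peer update: keep only the latest observation per
        # neighbor, so per-iteration state stays O(|E|) as in the paper.
        self.neighbors[msg.sender] = msg
```

Because only the newest observation per neighbor is stored, communication and memory scale with the number of edges rather than the message rate.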
2. Mathematical Foundations and Algorithmic Details
Relative range between robots $i$ and $j$ is modeled as $d_{ij} = \lVert p_j - p_i \rVert + \eta$, with zero-mean Gaussian noise $\eta \sim \mathcal{N}(0, \sigma_d^2)$. Visual detection involves back-projecting 2D image points through the known camera intrinsics $K$ and normalizing, yielding a unit bearing direction in the robot's local frame.
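The back-projection step can be made concrete with a minimal pinhole sketch (the real system uses a fisheye model, whose distortion terms are omitted here for brevity): a pixel is lifted onto the normalized image plane via the intrinsics and scaled to unit length.

```python
import math

def pixel_to_bearing(u, v, fx, fy, cx, cy):
    """Back-project a pixel (u, v) through a pinhole intrinsic model
    (fx, fy: focal lengths; cx, cy: principal point) and normalize to a
    unit bearing vector in the camera frame. Fisheye distortion, which a
    real 185-degree lens requires, is deliberately ignored in this sketch."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)
```

A detection at the principal point maps to the optical axis `(0, 0, 1)`; off-center pixels yield tilted unit vectors whose angular error grows with uncorrected distortion.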
The GNN front-end applies message-passing updates of the generic form

$$h_i^{(\ell+1)} = \phi\!\Big(h_i^{(\ell)}, \sum_{j \in \mathcal{N}(i)} \psi\big(h_i^{(\ell)}, h_j^{(\ell)}, e_{ij}\big)\Big),$$

where $h_i^{(\ell)}$ are per-robot latent features and the edge features $e_{ij}$ encode the range and bearing measurements. Soft assignments are then solved by a Sinkhorn operator over the learned score matrix (Wang et al., 11 Dec 2025).
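The Sinkhorn operator itself is simple to state: exponentiate the score matrix with a temperature and alternate row/column normalizations until the result is (approximately) doubly stochastic. A dependency-free sketch, with illustrative iteration count and temperature:

```python
import math

def sinkhorn(scores, n_iters=20, tau=0.1):
    """Turn a square score matrix into an approximately doubly-stochastic
    soft-assignment matrix via alternating row/column normalization.
    n_iters and tau are illustrative defaults, not the paper's settings."""
    # Softmax-style kernel: exponentiate with temperature tau.
    P = [[math.exp(s / tau) for s in row] for row in scores]
    n = len(P)
    for _ in range(n_iters):
        # Row normalization: each row sums to 1.
        P = [[p / sum(row) for p in row] for row in P]
        # Column normalization: each column sums to 1.
        col_sums = [sum(P[i][j] for i in range(n)) for j in range(n)]
        P = [[P[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return P
```

Because every step is differentiable, gradients flow from the PGO back-end through the assignment into the learned score features, which is what makes the front-end trainable end to end.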
Back-end PGO fuses all constraints, initializing node positions with front-end output and using front-end covariances to balance the impact of vision vs. UWB vs. odometry.
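The role of the front-end covariances is easiest to see in a toy version of the back-end cost: each residual (range, bearing component, odometry prior) is scaled by its inverse variance, so better-trusted constraints pull harder on the estimate. The residual types and the numerical-gradient refiner below are simplified stand-ins for the paper's SE(3) residuals and Gauss-Newton solver.

```python
import math

def pgo_cost(p, edges):
    """Weighted least-squares cost over a 2D relative position p = (x, y).
    Each edge is (kind, measurement, variance); the inverse variance acts
    as the information weight supplied by the front-end."""
    cost = 0.0
    for kind, meas, var in edges:
        d = math.hypot(p[0], p[1])
        if kind == "range":          # UWB: residual on inter-robot distance
            r = d - meas
        elif kind == "bearing_x":    # vision: residual on one bearing component
            r = p[0] / d - meas
        else:                        # "odom_x": odometry prior on x
            r = p[0] - meas
        cost += r * r / var
    return cost

def refine(p0, edges, lr=1e-4, steps=3000):
    """Tiny numerical-gradient descent standing in for Gauss-Newton."""
    p, eps = list(p0), 1e-6
    for _ in range(steps):
        base = pgo_cost(p, edges)
        grad = []
        for i in range(2):
            q = list(p); q[i] += eps
            grad.append((pgo_cost(q, edges) - base) / eps)
        p = [p[i] - lr * grad[i] for i in range(2)]
    return p
```

With a consistent set of measurements the cost is zero at the true relative position, and the refiner descends toward it from a nearby initialization (the front-end output in the real pipeline).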
3. System Implementation and Calibration for SnailBot
On SnailBot, mandatory sensors are UWB tags, a monocular camera, and odometry (preferably wheel-encoder based). Calibration is crucial:
- Camera intrinsics and distortion must be estimated; this is typically achieved via standard SLAM toolkits or custom checkerboard/AprilTag setups.
- Extrinsic calibration (), mapping between UWB and camera frames, is accomplished via hand-eye calibration using AprilTag rigs or controlled UWB ranging to a known checkerboard.
- If the camera FOV is limited (<120°), the GNN must use deeper message-passing (more update rounds) to compensate for reduced angular coverage.
- For slow-dynamics platforms or less frequent updates, Sinkhorn iterations can be reduced (e.g., to 50) and the learning rate correspondingly decreased to mitigate overfitting.
All hardware modules must be precisely measured and aligned; residual UWB tag offsets are compensated in software.
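The two calibration corrections described above, rotating camera-frame bearings into the body frame and compensating the UWB tag lever arm, can be sketched as follows. The numeric calibration values are hypothetical placeholders, and the 2D yaw-only rotation is a simplification of the full extrinsic transform:

```python
import math

# Hypothetical calibration values: a small camera-to-body yaw misalignment
# and the UWB tag's lever-arm offset from the body origin, in meters.
CAM_YAW = math.radians(2.0)
TAG_OFFSET = (0.05, 0.0)   # tag mounted 5 cm ahead of the body origin

def bearing_cam_to_body(bx, by):
    """Rotate a 2D unit bearing from the camera frame into the body frame
    (yaw-only stand-in for the full extrinsic rotation)."""
    c, s = math.cos(CAM_YAW), math.sin(CAM_YAW)
    return (c * bx - s * by, s * bx + c * by)

def compensate_range(raw_range, bearing_body):
    """First-order software compensation of the tag lever arm: subtract the
    offset's projection onto the line-of-sight direction."""
    proj = TAG_OFFSET[0] * bearing_body[0] + TAG_OFFSET[1] * bearing_body[1]
    return raw_range - proj
```

Since rotation preserves length, calibrated bearings stay unit-norm, and a tag mounted 5 cm toward the peer shortens a 5.00 m raw range to 4.95 m after compensation.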
4. Empirical Evaluation and Performance Analysis
Extensive simulation and real-world assessments have been performed:
- Simulation: Up to 16 robots, 180° FOV, with 40% spurious visual detections. Achieves RMSE of 0.144 m (vs. 0.198 m for a “Simple Match + PGO” baseline) in Sim-16.
- Real-world: 5-drone tests (indoor, MCS ground-truth), both LOS and NLOS, achieving RMSE 0.129 m (vs. 0.498 m for vision-threshold baseline).
- Failure analysis: Odometry-only back-ends diverge under drift, and simple matching is highly fragile to occlusions. The unified GNN–PGO system is robust across both cluttered and open scenes, and the optimal-transport matching and optimization overhead remains modest (roughly 10 ms for the GNN and 30 ms for PGO per robot at the evaluated team sizes).
Best-practice tuning includes inflating the assumed vision noise covariance in occluded scenarios, or prioritizing vision (decreasing the covariance of visual constraints) in UWB-degraded environments (Wang et al., 11 Dec 2025).
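That tuning rule amounts to a context-dependent reweighting of the per-modality covariances before they enter the PGO cost. A minimal sketch, where the linear inflation rule and the gain of 4.0 are illustrative choices, not values from the paper:

```python
def reweight_covariances(sigma_vis, sigma_uwb, occlusion_score, uwb_dilution):
    """Inflate the vision covariance when occlusion is likely and the UWB
    covariance when ranging quality degrades, so the weighted least-squares
    back-end automatically leans on the healthier modality.
    occlusion_score and uwb_dilution are assumed to lie in [0, 1];
    the gain of 4.0 is an illustrative placeholder."""
    vis = sigma_vis * (1.0 + 4.0 * occlusion_score)
    uwb = sigma_uwb * (1.0 + 4.0 * uwb_dilution)
    return vis, uwb
```

Under full occlusion the vision covariance grows fivefold while the UWB covariance is untouched, shifting the optimizer's trust toward ranging, exactly the behavior the tuning guidance describes.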
5. Comparative Perspective and Methodological Interoperability
The SnailBot GNN–PGO system is distinguished from alternative approaches along several axes:
| Aspect | GNN–PGO (Mr Virgil) | Simple Matcher + PGO | Odometry-only |
|---|---|---|---|
| Robust to Occlusion | Yes | No | N/A |
| Handles Spurious Visual Matches | Yes | No | N/A |
| Decentralized Op. | Yes | No | Yes |
| Real-world (NLOS) RMSE (m, 5 drones) | 0.129 | 0.498 | Diverges |
This approach generalizes earlier distributed methods (e.g., NLLS and trilateration (Cornejo et al., 2013)), providing higher robustness and uncertainty quantification. Unlike certifiably optimal bearing-only SDP relaxations (Wang et al., 2022), this pipeline natively fuses bearing, range, and odometric data, with end-to-end trainable uncertainty integration and support for dynamic, partially connected topologies.
6. Adaptability, Limitations, and Future Directions
The architecture is adaptive to platform constraints: for robots with limited FOV, GNN depth is increased; for low-dynamics environments, update rates and message complexity are reduced accordingly. Key trade-offs exist: under high UWB interference, vision constraints should be trusted more, and vice versa.
A plausible implication is that further gains can be realized by integrating additional sensing modalities (e.g., UWB AoA, radar) or leveraging learned uncertainty maps to make the information weighting more context-sensitive. Extension to full 6-DoF state estimation and tighter integration with task-level planners (formation, coverage) is immediate, given the flexible SE(3)-based PGO back-end.
Overall, the SnailBot relative localization system, as realized in this GNN–PGO paradigm, achieves centimeter–decimeter accuracy in both structured and unstructured environments, is computationally tractable for moderate team sizes, and enables robust, distributed multi-robot operation without reliance on global positioning (Wang et al., 11 Dec 2025).