SandWorm: Biomimetic Subsurface Navigator
- SandWorm is a biomimetic robotic system that combines screw-actuated peristaltic locomotion with an event-based visuotactile sensor for precise exploration in granular media.
- It features a rigid spiral shell with pushrod actuation and IMU-guided event filtering, yielding significant improvements in locomotion speed and tactile imaging fidelity.
- Deep learning-driven contact mask estimation and robust material classification enable real-time subsurface navigation and efficient pipeline inspection in complex environments.
SandWorm is a biomimetic robotic system designed for navigation and tactile perception in granular media, integrating a screw-actuated peristaltic locomotion mechanism and the SWTac visuotactile sensor. The platform fuses mechanical innovation, event-based sensing, active vibration, and real-time algorithmic filtering for robust operation in environments characterized by unpredictable particle behaviors. Its pipeline includes state-of-the-art tactile imaging, contact mask estimation with deep learning, and feedback-driven locomotion for subsurface exploration and pipeline inspection in complex, field-realistic settings (Li et al., 20 Jan 2026).
1. Mechanical Architecture and Locomotion
SandWorm’s locomotion system leverages a rigid spiral shell described as an “Archimedean screw” with pitch $p$, outer shell diameter 32 mm, and length 80 mm. A brushless DC motor rotates the shell at 60–100 RPM, translating angular displacement $\theta$ into axial motion:
$z = \frac{p}{2\pi}\,\theta$
so the axial speed is $v = \frac{p}{2\pi}\,\dot\theta$.
Locomotion is enhanced by an internal pushrod applying an alternating axial force $F_p(t)$, yielding two phases:
- Extension: the pushrod drives the body forward against the resistive force $F_{\rm res}$.
- Retraction: the pushrod withdraws and the shell re-anchors, completing the peristaltic cycle.
Here, $F_{\rm res}$ incorporates gravity effects on inclines. The combined screw–peristalsis action delivers a measured maximum locomotion speed of 12.5 mm/s in a 200 mm-ID pipe, a 62% improvement over screw-only drives. The pushrod stroke is approximately 30 mm at 1 Hz, with $F_{\rm res}$ accounting for all resistive forces from the medium and boundaries (Li et al., 20 Jan 2026).
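Since the numeric pitch is not given above, the screw kinematics can be sketched with the pitch as a free parameter; the 7.5 mm value in the check below is purely hypothetical, chosen so that 100 RPM reproduces the reported 12.5 mm/s.

```python
def screw_axial_speed(pitch_mm: float, rpm: float) -> float:
    """Axial speed (mm/s) of an Archimedean screw: one revolution
    advances the shell by one pitch, so v = pitch * (rpm / 60)."""
    return pitch_mm * rpm / 60.0

# Hypothetical pitch of 7.5 mm at the upper 100 RPM limit:
v_screw = screw_axial_speed(7.5, 100.0)   # 12.5 mm/s

# The paper reports 12.5 mm/s combined speed as a 62% improvement
# over the screw-only drive, implying a screw-only baseline of:
combined = 12.5                 # mm/s, measured
screw_only = combined / 1.62    # ~7.7 mm/s
```

The 62% figure thus puts the screw-only baseline near 7.7 mm/s, with the peristaltic pushrod contributing the remaining ~4.8 mm/s.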
2. SWTac Event-Based Visuotactile Sensor
The SWTac sensor integrates an actively vibrated elastomer (PDMS, Sylgard 184, 17:1 mix, Shore 20 A, 1.5 mm thick) with a decoupled event camera, ensuring high-fidelity dynamic and static tactile imaging.
Vibration Isolation
An array of eight lateral springs (stiffness $k$) and two flexible-shaft couplers constitute a second-order isolation system for the camera:
$m\ddot{x} + c\dot{x} + kx = F(t)$
Transmissibility is defined as
$T(\omega) = \sqrt{\dfrac{1 + (2\zeta r)^2}{(1 - r^2)^2 + (2\zeta r)^2}}, \quad r = \omega/\omega_n, \ \omega_n = \sqrt{k/m}, \ \zeta = \dfrac{c}{2\sqrt{km}}$
with a measured 83% vibration isolation at 50 Hz.
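The isolator behavior can be checked numerically with the textbook second-order transmissibility formula. The natural frequency and damping ratio below are hypothetical (the paper's spring and coupler values are not recoverable here), chosen only to land near the reported 83% isolation at 50 Hz.

```python
import math

def transmissibility(f: float, f_n: float, zeta: float) -> float:
    """Displacement transmissibility of a mass-spring-damper isolator:
    T = sqrt((1 + (2*zeta*r)^2) / ((1 - r^2)^2 + (2*zeta*r)^2)),
    with frequency ratio r = f / f_n. Isolation (T < 1) requires r > sqrt(2)."""
    r = f / f_n
    num = 1.0 + (2.0 * zeta * r) ** 2
    den = (1.0 - r * r) ** 2 + (2.0 * zeta * r) ** 2
    return math.sqrt(num / den)

# Hypothetical isolator: natural frequency 19 Hz, damping ratio 0.05.
T = transmissibility(50.0, 19.0, 0.05)
isolation = 1.0 - T   # fraction of vibration attenuated at 50 Hz
```

With these assumed parameters the model predicts roughly 82–83% isolation at 50 Hz, consistent in magnitude with the measured value.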
Elastomer Vibration
Dual actuation is applied: vertical via an electromagnetic valve and horizontal via offset-mass motors (amplitudes up to 100 µm). Optimal masked signal-to-noise ratio (MSNR) is observed at these vibration parameters and mid-level event thresholds (Li et al., 20 Jan 2026).
3. Event-Based Imaging, MSNR, and Temporal Filtering
Grayscale Event Reconstruction
Event streams $\mathcal E_{\rm filt}$ are integrated across 1 ms windows ($\Delta T = 1\,$ms), discarding polarity $p_i$, to generate sharp 1 kHz frames:
$G_k(x,y) = \sum_{\substack{e_i\in\mathcal E_{\rm filt}\\ (x_i,y_i)=(x,y)\\ t_i\in[T_k,T_k+\Delta T)}} C$
where $C$ is the event contrast threshold.
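A minimal numpy sketch of this integration, using synthetic event tuples $(t, x, y, p)$ in place of the real filtered stream:

```python
import numpy as np

def reconstruct_frames(events, shape, dt=1e-3, C=1.0):
    """Integrate filtered events into grayscale frames.

    events: array of (t, x, y, p) rows; polarity p is ignored, and each
    event contributes the contrast threshold C to its pixel, summed over
    dt-wide windows (dt = 1 ms -> 1 kHz frame rate)."""
    t = events[:, 0]
    k = np.floor((t - t.min()) / dt).astype(int)   # window index per event
    frames = np.zeros((k.max() + 1, *shape))
    # Accumulate C at (frame, y, x) for every event, with repeats summed:
    np.add.at(frames, (k, events[:, 2].astype(int), events[:, 1].astype(int)), C)
    return frames

# Toy stream: three events at pixel (x=1, y=2) in the first millisecond,
# one event in the second millisecond.
ev = np.array([[0.0001, 1, 2, 1],
               [0.0004, 1, 2, -1],
               [0.0009, 1, 2, 1],
               [0.0015, 1, 2, 1]])
frames = reconstruct_frames(ev, shape=(4, 4))
```

`np.add.at` is used instead of plain fancy-indexed assignment so that multiple events landing on the same pixel within a window accumulate rather than overwrite.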
Masked SNR (MSNR)
MSNR evaluates image quality restricted to the contact (foreground) region, i.e., the SNR is computed over a foreground mask rather than the full frame.
IMU-Guided Temporal Filtering
Sensor output quality fluctuates with vibration phase and is modeled as a function of the elastomer's vertical displacement and the measured IMU acceleration.
Peak-aligned, bandpass-filtered IMU data is fitted to predict high-quality intervals; only event slices above threshold are retained, resulting in up to 24% MSNR improvement, 46% reduction in MSNR standard deviation, and 1 ms processing latency (Li et al., 20 Jan 2026).
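The gating idea can be sketched as follows. The FFT-based bandpass, the rectified-signal quality proxy, and the keep fraction are illustrative assumptions, not the paper's exact peak-alignment and fitting procedure; the 200 Hz vibration and 2 kHz sampling rate are likewise synthetic.

```python
import numpy as np

def select_high_quality_windows(accel, fs, vib_freq, keep_frac=0.5):
    """IMU-guided gating sketch: band-limit the accelerometer trace around
    the elastomer vibration frequency via FFT masking, then keep only the
    samples whose predicted quality (here the rectified filtered signal)
    lies in the top `keep_frac` fraction. Event slices are retained only
    where the returned boolean mask is True."""
    n = len(accel)
    spec = np.fft.rfft(accel)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs > 0.5 * vib_freq) & (freqs < 2.0 * vib_freq)
    spec[~band] = 0.0                          # crude bandpass around vib_freq
    filtered = np.fft.irfft(spec, n=n)
    quality = np.abs(filtered)                 # phase-aligned quality proxy
    thresh = np.quantile(quality, 1.0 - keep_frac)
    return quality >= thresh

# Synthetic 200 Hz vibration sampled at 2 kHz, with drift and noise:
fs = 2000.0
t = np.arange(0, 0.1, 1.0 / fs)
rng = np.random.default_rng(0)
accel = np.sin(2 * np.pi * 200 * t) + 0.3 * t + 0.05 * rng.normal(size=t.size)
mask = select_high_quality_windows(accel, fs, vib_freq=200.0)
```

Only events falling inside `True` samples of the mask would then be passed to the frame-integration step.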
4. Contact Surface Estimation by Deep Learning
Finite-element simulations indicate indenting the elastomer yields asymmetric edge responses (sharp inside, blurred outside). A U-Net architecture processes 256 × 256 event frames $G_k$ to produce binary contact masks, capitalizing on these edge features.
Network and Training
- Four-level encoder/decoder: 3 × 3 convolution + BN + ReLU, 2 × 2 max-pooling, up-convolution, and skip connections.
- Final 1 × 1 convolution with sigmoid activation for pixelwise mask probabilities.
- Training dataset: 300 hand-annotated images of 12 textures, augmented to 3,000 samples, cross-category hold-out.
- Loss: pixelwise loss on the predicted mask probabilities.
IMU-filtered inference achieves SSIM ≈ 0.969, IoU ≈ 0.81, and RMSE ≈ 0.069 (Li et al., 20 Jan 2026).
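The reported mask metrics are standard and can be computed as below (IoU and RMSE shown; SSIM needs a windowed implementation, e.g. scikit-image's). The toy masks are illustrative only.

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-union of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union else 1.0

def rmse(pred: np.ndarray, gt: np.ndarray) -> float:
    """Root-mean-square error between predicted and ground-truth masks."""
    return float(np.sqrt(np.mean((pred.astype(float) - gt.astype(float)) ** 2)))

# Toy masks: a predicted 4x4 square offset by one pixel from ground truth.
gt = np.zeros((8, 8), dtype=bool); gt[2:6, 2:6] = True       # 16 px
pred = np.zeros((8, 8), dtype=bool); pred[3:7, 3:7] = True   # 16 px, shifted
score_iou, score_rmse = iou(pred, gt), rmse(pred, gt)
```

Here the overlap is 9 px against a union of 23 px, so even a one-pixel shift of a small mask drops IoU well below the ~0.81 reported for the trained network.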
5. Tactile and Locomotive Performance
SandWorm’s integrated system demonstrates proficiency on granular and mixed-media tasks.
Tactile Sensing Outcomes
- Texture Resolution: 0.2 mm, enabling recovery of fine board patterns.
- Material Classification: Five stone classes (grit, gravel, pebble, cobble, eggstone) with 98% accuracy using fine-tuned ResNet-18 at 500 Hz.
- Shear Force Estimation: Tip displacement mapped to force via a Random Forest regressor; MAE = 0.15 N (Li et al., 20 Jan 2026).
Locomotion and Task Benchmarks
- Pipeline Inspection: 200 mm-ID, 600 mm in 48 s (12.5 mm/s), with navigational triggers from shear force sensing.
- Obstacle and Bend Navigation: Reliable steering in 15° bends, wall intersections, and 90° elbows (150 mm ID).
- Dredging: Removal of gravel/cobble/eggstone blocks with 90% success in blocked pipeline trials (600 mm in 84–90 s).
Subsurface and Field Performance
- Granular Drilling: 40 trials in beach sand, TPE, EPP (100 kg/m³), and EPE (10 kg/m³): 36/40 buried objects found (90% success, ≤120 s per trial).
- Field Operations: Effective on grass, bushes, cement, autonomous dredging in mud/leaves/gravel, and recovery of diverse objects (fossils, bottle caps) from natural soil (Li et al., 20 Jan 2026).
6. Significance and Technological Implications
SandWorm exemplifies hardware–software co-design and bio-inspiration (screw plus peristaltic actuation), yielding robust subsurface locomotion and sub-millimeter-level tactile imaging in granular environments previously considered intractable for robotic agents. Co-optimization of vibratory actuation, event-based tactile sensing, IMU-guided event selection, and deep-learning-driven contact reconstruction enables precise, real-time (1 kHz) perception and control at the tip, with field demonstrations validating efficacy across a spectrum of real-world scenarios. A plausible implication is that this architectural fusion could generalize to future bio-inspired robots facing similarly challenging, dynamic contact conditions (Li et al., 20 Jan 2026).