Safe Local Exploration for Replanning

Updated 26 November 2025

The paper presents a framework that integrates real-time mapping, conservative trajectory optimization, and information-gain metrics to safely navigate cluttered, unknown spaces.
It employs collision costs derived from Euclidean Signed Distance Fields to enforce strict safety while dynamically adjusting plans to overcome local minima.
Empirical results demonstrate high success rates and shorter paths, validating the method's effectiveness for onboard, resource-constrained mobile robots.

Safe local exploration for replanning refers to the class of methodologies enabling mobile robots—often under nontrivial dynamic constraints—to efficiently navigate cluttered, unknown environments while maintaining robust safety guarantees, typically through the real-time integration of mapping, trajectory optimization, and an active local exploration strategy. These systems continuously update their environment models (often using occupancy grids or distance fields from onboard sensors) and adapt their plans to aggressively but safely advance toward global goals, resolving local minima by deploying information-gain-driven intermediate goals or exploration actions. Replanning cycles are tightly coupled to perceptual updates and safety checks, resulting in closed-loop algorithms that operate with bounded latency onboard resource-constrained vehicles, such as micro-aerial platforms in dense forest or office environments (Oleynikova et al., 2017).

1. Conservative Trajectory Optimization and Collision Avoidance

Modern safe local exploration frameworks generally rely on trajectory optimization-based local planners that embed safety directly into both the cost function and the constraints. In (Oleynikova et al., 2017), the state of a micro-aerial vehicle (MAV) is modeled as a triple-integrator system $\mathbf{x}(t) = [\mathbf{p}(t), \mathbf{v}(t), \mathbf{a}(t)]^\top \in \mathbb{R}^9$, controlled via jerk $\mathbf{u}(t)=\mathbf{j}(t)\in\mathbb{R}^3$ with hard bounds on velocity, acceleration, and jerk. The objective function incorporates:

Smoothness via a high-order derivative penalty (jerk/snap).
Collision cost $c(\mathbf{p}(t))$ derived from the Euclidean Signed Distance Field (ESDF), with unknown space outside a clearing radius $r_c$ treated as occupied.
A soft goal cost to enable feasible endpoint relaxation.

Collision avoidance is achieved by constructing the ESDF onboard from streaming depth or stereo data. Trajectories are penalized for proximity to both observed obstacles and unknown space, which enforces strict safety—no trajectory ever enters unobserved volume, and the gradient $\nabla c(\mathbf{p})$ steers the optimizer away from risk. The approach remains dynamically feasible, with explicit bounds on velocity and acceleration, and is validated running fully onboard at 4 Hz (Oleynikova et al., 2017).
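The conservative collision term described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the piecewise potential is the CHOMP-style form often used with distance fields (the source does not state the exact function), the ESDF is represented by a hypothetical dictionary keyed by voxel index, and the clearing radius $r_c$ is assumed to be measured from a reference pose called `start`.

```python
import math

def collision_potential(d, eps=1.0):
    """Smooth penalty on signed distance d; zero once clearance exceeds eps."""
    if d < 0.0:                       # inside an obstacle: linear growth
        return -d + 0.5 * eps
    if d <= eps:                      # within the safety margin: quadratic falloff
        return 0.5 * (d - eps) ** 2 / eps
    return 0.0

def conservative_distance(p, esdf, start, r_c, voxel=0.5):
    """ESDF lookup that treats unobserved voxels as occupied.

    esdf is a hypothetical dict {voxel index: signed distance}; a missing key
    means the voxel was never observed.
    """
    key = tuple(math.floor(c / voxel) for c in p)
    if key in esdf:
        return esdf[key]
    if math.dist(p, start) <= r_c:    # assumed-free sphere around the start pose
        return math.inf
    return 0.0                        # unknown space counts as an obstacle surface

def path_collision_cost(points, esdf, start, r_c, eps=1.0):
    """Sum the potential over sampled positions of a candidate trajectory."""
    return sum(
        collision_potential(conservative_distance(p, esdf, start, r_c), eps)
        for p in points
    )
```

Because unobserved voxels return a distance of zero, any sample that strays into unknown space outside the cleared sphere incurs the same penalty as touching an obstacle surface, which is what pushes the optimizer back toward observed free space.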

2. Local Minima and Active Local Exploration

Safe local exploration planners must address local minima in trajectory optimization—a common obstacle in cluttered environments. The system in (Oleynikova et al., 2017) detects such minima when the optimizer fails to find a collision-free spline. It then invokes a local exploration module that selects intermediate goals using an information-gain metric:

Information gain $l(\mathbf{x}, \gamma)$ quantifies the number of unknown voxels visible in the camera frustum at a candidate viewpoint $\mathbf{x}$ and yaw $\gamma$.
A combined reward $R(\mathbf{x}_i, \gamma)$ balances information gain and proximity to the global goal.

Intermediate goal selection is performed by random sampling within a local sphere and maximizing the combined reward, then replanning toward the selected goal. This mechanism efficiently escapes local minima, avoids oscillation, and exposes new free space, as shown by increased success rates and shorter paths compared to single-step or optimistic global planners (Oleynikova et al., 2017).
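A rough sketch of this sampling step is given below. The exact form of $R(\mathbf{x}_i, \gamma)$ is not stated here, so the weighted sum of information gain and normalized progress toward the goal is an assumption, as are the names `info_gain`, `is_free`, `w_gain`, and `w_goal`. The frustum voxel count is left to a caller-supplied function.

```python
import math
import random

def sample_in_sphere(center, radius, rng):
    """Rejection-sample a point uniformly inside a sphere."""
    while True:
        offset = [rng.uniform(-radius, radius) for _ in range(3)]
        if sum(o * o for o in offset) <= radius * radius:
            return tuple(c + o for c, o in zip(center, offset))

def select_intermediate_goal(position, goal, radius, info_gain, is_free,
                             n_samples=200, n_yaws=8,
                             w_gain=1.0, w_goal=1.0, seed=0):
    """Return ((x, yaw), reward) maximizing an assumed weighted-sum reward."""
    rng = random.Random(seed)
    best, best_reward = None, -math.inf
    d_now = math.dist(position, goal)
    for _ in range(n_samples):
        x = sample_in_sphere(position, radius, rng)
        if not is_free(x):            # only known-free candidates are admissible
            continue
        # Normalized reduction in goal distance, bounded to [-1, 1].
        progress = (d_now - math.dist(x, goal)) / radius
        for k in range(n_yaws):
            yaw = 2.0 * math.pi * k / n_yaws
            reward = w_gain * info_gain(x, yaw) + w_goal * progress
            if reward > best_reward:
                best, best_reward = (x, yaw), reward
    return best, best_reward
```

Here `info_gain(x, yaw)` would count unknown voxels inside the camera frustum, and `is_free(x)` would query the map. If no sample is admissible, the function returns `None` as the goal and the caller has to decide how to fall back.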

3. Information-Gain–Driven Exploration Strategies

A prominent theme across frameworks is the direct coupling of trajectory generation to information gain from mapping. In (Oleynikova et al., 2017), maximizing the number of newly observed voxels drives intermediate goal selection. RAPTOR (Zhou et al., 2020) further augments trajectory optimization with explicit perception-aware objectives: expected information gain is integrated over candidate state and camera/yaw trajectories, active yaw planning maximizes the visibility of unknown regions, and risk-aware refinement penalizes proximity to uncertain voxels via an accumulated risk metric over the planned path. Utility-per-cost functions, as in (İşleyen et al., 12 Mar 2025), maximize actionable information (volume or entropy of new observations) per unit navigation cost, yielding more efficient coverage and provable finite-time guarantees under consistent mapping assumptions.
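The utility-per-cost idea can be illustrated with a few lines of code. This is only a sketch of the general ratio criterion, not the formulation of any cited paper; the candidate tuples and the `min_cost` guard against division by zero are assumptions.

```python
def best_by_utility_per_cost(candidates, min_cost=1e-6):
    """Pick the candidate with the highest information gain per unit path cost.

    candidates: iterable of (name, info_gain, path_cost) tuples.
    Candidates with no expected gain are skipped.
    """
    scored = [
        (gain / max(cost, min_cost), name)
        for name, gain, cost in candidates
        if gain > 0
    ]
    return max(scored)[1] if scored else None
```

For example, a viewpoint with gain 6 at cost 2 (ratio 3) is preferred over one with gain 10 at cost 5 (ratio 2), even though the latter reveals more in absolute terms.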

4. Explicit Safety Guarantees and Certification

Multiple approaches provide formal safety certificates. Frameworks based on reachability analysis (Fridovich-Keil et al., 2018) constrain the system to operate within a precomputed invariant set (safety kernel) constructed via the Hamilton–Jacobi PDE, maintaining backward reachability to the initial state under all disturbance realizations. This guarantees recursive feasibility and collision avoidance—every trajectory lies within the set of states from which a safe return to home is possible at all times. Probabilistic methods extend this by certifying collision safety using high-confidence Bayesian estimates (e.g., GP-based safe set estimation in SaGeMPC (Prajapat et al., 9 Feb 2024)), belief-space planners with tunable collision probability thresholds (Pairet et al., 2020), or by bounding risk integrals in optimization (Zhou et al., 2020).
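To make the notion of a tunable collision probability threshold concrete, the sketch below checks a chance constraint in a deliberately simplified setting: the position error along the obstacle normal is assumed to be zero-mean Gaussian with standard deviation `sigma`, and the obstacle is approximated by a half-space at distance `clearance`. None of the cited planners is claimed to use this exact test.

```python
import math

def collision_probability(clearance, sigma):
    """P(position error along the obstacle normal exceeds the clearance)."""
    # Gaussian tail: 1 - Phi(clearance / sigma), written with erfc.
    return 0.5 * math.erfc(clearance / (sigma * math.sqrt(2.0)))

def satisfies_chance_constraint(clearance, sigma, p_max):
    """True if the estimated collision probability is at most p_max."""
    return collision_probability(clearance, sigma) <= p_max
```

With `sigma = 1.0` and `p_max = 0.01`, a clearance of three standard deviations passes the check, while a clearance of one standard deviation does not.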

5. Real-Time Mapping, Planning, and Replanning

Robust safe exploration frameworks are characterized by closed-loop, high-frequency (4–20 Hz) mapping and planning cycles. In (Oleynikova et al., 2017), onboard mapping with TSDF/ESDF is refreshed every cycle; RAPTOR (Zhou et al., 2020) achieves <15 ms per replanning iteration, including homotopy-guided multi-path generation, B-spline optimization, and risk refinement; submap-based volumetric approaches (Schmid et al., 2020) integrate temporally local sliding windows, spatially local aggregate TSDF, and pose-graph-optimized global submaps to maintain safety under odometry drift. Replanning is adaptive: each new sensor update triggers recomputation of feasible trajectories, intermediate goal selection, and safety checks as dictated by recent observations.
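One replanning cycle of this kind might be organized as in the sketch below. The callables `update_map`, `optimize`, and `pick_intermediate_goal` are hypothetical placeholders for the mapping, trajectory optimization, and local exploration components, and the final "stop" fallback is an assumption rather than something the cited systems specify.

```python
def replan_step(update_map, optimize, pick_intermediate_goal, state, goal):
    """Run one map-update and replanning cycle; return (mode, trajectory)."""
    world = update_map()                   # integrate the latest sensor data
    traj = optimize(world, state, goal)    # first try to reach the global goal
    if traj is not None:
        return "goal", traj
    # Optimizer failed (likely a local minimum): fall back to exploration.
    sub_goal = pick_intermediate_goal(world, state, goal)
    if sub_goal is not None:
        traj = optimize(world, state, sub_goal)
        if traj is not None:
            return "explore", traj
    return "stop", None                    # assumed safe fallback, e.g. hover
```

Calling this function once per sensor update gives the closed-loop behavior described above: the plan always reflects the newest map, and exploration is invoked only when direct optimization toward the goal fails.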

6. Empirical Results and Performance Benchmarks

Experimental validation confirms the efficacy of conservative local exploration for safe replanning. In simulated cluttered forests (Oleynikova et al., 2017), the conservative trajectory optimizer plus local exploration module achieves ≥90% success in moderate to high-density scenarios, outperforming pure local and optimistic planners, with up to 35% shorter paths. Real-world dense forest flights demonstrate onboard mapping and replanning at 4 Hz with zero collisions. Similarly, (Zhou et al., 2020) reports 99.8% replanning success and collision rate <0.2% over 1000 trials, with 30% faster mean replanning and lower risk than baseline planners. Submap-based approaches (Schmid et al., 2020) sustain 100% collision-free operation even under severe odometry drift, with significant improvements in volumetric coverage efficiency. These empirical metrics indicate that safe local exploration methods robustly navigate dense, unknown environments, maintain safety, and achieve high coverage rates in resource-constrained, real-time deployments.

7. Summary and Implications

Safe local exploration for replanning integrates conservative, dynamically feasible trajectory optimization with active information-gain-driven goal selection to guarantee collision avoidance and efficient mapping progress in unknown cluttered environments. The strict treatment of unknown space as forbidden, rapid onboard mapping, adaptive intermediate-goal selection, and explicit, often formal, safety certification collectively yield robust performance under diverse motion, sensing, and environment uncertainty models. This approach is empirically validated across MAVs, ground robots, and underwater vehicles, and is extensible to complex nonlinear dynamics and learning-based policies (Oleynikova et al., 2017). A plausible implication is that the continued evolution of real-time onboard safe exploration frameworks is central to the deployment of autonomous agents in complex, unstructured environments requiring robust guarantees of safety and coverage.