Regulated Pure Pursuit (RPP) in Robotics
- Regulated Pure Pursuit (RPP) is a path-following algorithm that dynamically adjusts lookahead and velocity to improve path-tracking precision and safety.
- It integrates heuristic-based controls in systems like ROS 2 Nav2 and employs reinforcement learning to optimize performance in both constrained and high-speed environments.
- Experimental results show that RPP significantly reduces tracking error and improves lap times, demonstrating robust performance in service robotics and autonomous racing.
Regulated Pure Pursuit (RPP) is a class of path-following algorithms for mobile robots and autonomous vehicles that augments traditional Pure Pursuit and Adaptive Pure Pursuit methods with online regulation of linear velocities to enhance path-tracking precision, safety, and robustness. RPP frameworks employ heuristic velocity modulation or reinforcement learning–based policies to dynamically adjust the lookahead distance and velocity profile, optimizing performance in both constrained and open environments. RPP is implemented as both a heuristic-based controller in the ROS 2 Nav2 stack for general robotics (Macenski et al., 2023), and as a reinforcement learning–guided regulator in aggressive autonomous racing (Elgouhary et al., 30 Mar 2026).
1. Mathematical Principles of Pure Pursuit and RPP
Pure Pursuit (PP) computes steering commands by identifying a lookahead point at arc-length distance $L$ ahead on the reference path and generating the curvature
$$\kappa = \frac{2y}{L^2},$$
where $y$ is the lateral offset of the lookahead point in the robot’s frame. Traditional PP assumes a fixed lookahead $L$ and commands a constant or externally determined speed.
Adaptive Pure Pursuit (APP) generalizes this by scaling the lookahead proportionally to instantaneous speed: $L_t = v_t \, l_t$, with $l_t$ a lookahead time tuning gain. This introduces the ability to trade off between tight path tracking (small $l_t$) and smooth operation (large $l_t$).
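The two geometric relations above can be sketched in a few lines of Python. This is a minimal illustration, not the Nav2 implementation: the function names are hypothetical, the clamp on the lookahead distance is an added assumption, and the lookahead distance is taken as the straight-line distance to the lookahead point in the robot frame.

```python
def adaptive_lookahead(speed, lookahead_time, min_dist, max_dist):
    """APP lookahead L_t = v_t * l_t, clamped to [min_dist, max_dist].

    The clamp is an assumption of this sketch, not part of the APP formula.
    """
    return min(max(abs(speed) * lookahead_time, min_dist), max_dist)


def pure_pursuit_curvature(x, y):
    """Curvature of the arc from the robot origin to lookahead point (x, y).

    (x, y) is expressed in the robot frame with +x forward, so y is the
    lateral offset and x*x + y*y is the squared lookahead distance L^2.
    """
    dist_sq = x * x + y * y
    if dist_sq == 0.0:
        raise ValueError("lookahead point coincides with the robot")
    return 2.0 * y / dist_sq


# Example: a robot at 0.5 m/s with a 1.5 s lookahead time.
L = adaptive_lookahead(speed=0.5, lookahead_time=1.5, min_dist=0.3, max_dist=0.9)
kappa = pure_pursuit_curvature(1.0, 1.0)
```

A point directly ahead (zero lateral offset) yields zero curvature, and the sign of the curvature follows the sign of the lateral offset.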
RPP frameworks, beginning with (Macenski et al., 2023), retain APP’s geometric core but introduce velocity regulation. The canonical RPP cycle executes:
- Select the lookahead distance $L_t$ at control tick $t$
- Transform the path into the robot frame and compute the curvature $\kappa$
- Regulate speed via heuristics related to curvature and proximity
- Simulate a short-horizon circular arc and stop if a collision is predicted
- Issue $(v_t, \omega_t)$ with $\omega_t = v_t \kappa$
2. Heuristic-Based Velocity Regulation and Safety
RPP extends APP by introducing two core heuristics:
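Curvature-based Slowdown:
$$v_t' = \begin{cases} v_t, & |\kappa| \le T_\kappa \\ \dfrac{v_t}{r_{\min}\,|\kappa|}, & |\kappa| > T_\kappa \end{cases}$$
with threshold $T_\kappa = 1/r_{\min}$. This enforces lower speeds in curves sharper than a tunable minimum radius $r_{\min}$, improving tracking and reducing overshoot.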
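A minimal Python sketch of the curvature-based slowdown follows. It assumes the heuristic scales the commanded speed by the ratio of the current turning radius ($1/|\kappa|$) to the minimum radius whenever the turn is sharper than that radius; the function and parameter names are hypothetical.

```python
def curvature_regulated_speed(v, kappa, r_min):
    """Scale speed down when the turn radius 1/|kappa| is below r_min.

    Assumed form: v / (r_min * |kappa|) above the curvature threshold
    1 / r_min, and v unchanged otherwise.
    """
    threshold = 1.0 / r_min
    if abs(kappa) > threshold:
        return v / (r_min * abs(kappa))
    return v
```

Under this form the output is continuous at the threshold: at $|\kappa| = 1/r_{\min}$ the scale factor is exactly 1, so the speed does not jump when the heuristic engages.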
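Proximity-based Slowdown:
$$v_t' = \begin{cases} v_t \, \alpha \, \dfrac{d_O}{d_{\text{prox}}}, & d_O \le d_{\text{prox}} \\ v_t, & \text{otherwise} \end{cases}$$
where $d_O$ is the nearest obstacle distance, $d_{\text{prox}}$ defines the slowdown onset, and $\alpha$ adjusts the aggressiveness.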
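The proximity-based slowdown can be sketched in the same way. The form below is an assumption: inside the slowdown distance, speed is scaled linearly with the obstacle distance and multiplied by a gain; the names are hypothetical.

```python
def proximity_regulated_speed(v, d_obstacle, d_prox, alpha):
    """Scale speed linearly with obstacle distance inside the slowdown zone.

    d_obstacle: distance to the nearest obstacle [m]
    d_prox:     distance at which the slowdown begins [m]
    alpha:      gain controlling how aggressive the slowdown is
    """
    if d_obstacle <= d_prox:
        return v * alpha * d_obstacle / d_prox
    return v
```

With a gain below 1, the speed drops by a step as soon as the robot enters the slowdown zone and then decreases linearly toward zero as the obstacle gets closer.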
The regulated speed is the lower of the curvature-regulated and proximity-regulated speeds. A forward simulation of the resulting arc verifies that it is collision-free. If a collision is predicted within the short time horizon (e.g. 0.5 s), the robot is commanded to stop (Macenski et al., 2023).
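The forward-simulation check can be illustrated with a short Python sketch. It rolls a constant $(v, \omega)$ command forward from the robot's current pose using simple Euler integration (an approximation of the circular arc) and returns a zero command if any sampled pose collides. The `in_collision` callback, the step size, and the function names are assumptions of this sketch; a real controller would query a costmap with the robot footprint.

```python
import math


def simulate_arc(v, omega, horizon, dt):
    """Sample (x, y, theta) poses along a constant (v, omega) command.

    Poses are expressed in the robot frame at the start of the rollout.
    """
    poses = []
    x = y = theta = 0.0
    for _ in range(int(round(horizon / dt))):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += omega * dt
        poses.append((x, y, theta))
    return poses


def safe_command(v, omega, in_collision, horizon=0.5, dt=0.05):
    """Return (v, omega), or (0.0, 0.0) if the rollout hits an obstacle.

    in_collision(x, y) is a hypothetical callback that reports whether
    the robot would be in collision at that position.
    """
    for x, y, _ in simulate_arc(v, omega, horizon, dt):
        if in_collision(x, y):
            return 0.0, 0.0
    return v, omega
```

Because the horizon is fixed in time, a faster command is checked over a longer distance, so the check becomes more conservative as speed increases.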
3. Reinforcement Learning–Guided Regulation
A distinct RPP variant leverages reinforcement learning for online, locally optimal adaptation. In "Dynamic Lookahead Distance via Reinforcement Learning-Based Pure Pursuit for Autonomous Racing," a Proximal Policy Optimization (PPO) agent modulates the lookahead distance $L_t$ given a compact state encoding that includes multi-horizon curvatures and their difference. The agent outputs a lookahead command, which is exponentially smoothed and used by the Pure Pursuit controller (Elgouhary et al., 30 Mar 2026).
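Exponential smoothing of the policy output can be sketched as below. This is an illustration only: the smoothing factor `beta` is a hypothetical value, and clipping the raw output to the [0.35, 4.0] m lookahead range (the range reported for the RL variant) before smoothing is an assumption about ordering.

```python
def smooth_lookahead(prev, raw, beta=0.7, lo=0.35, hi=4.0):
    """Exponentially smooth a raw lookahead command.

    prev: previously applied lookahead distance [m]
    raw:  new lookahead proposed by the policy [m]
    beta: weight on the previous value (hypothetical; closer to 1
          means smoother but slower adaptation)
    """
    clipped = min(max(raw, lo), hi)
    return beta * prev + (1.0 - beta) * clipped
```

Smoothing limits step-to-step jumps in the lookahead, which would otherwise show up as abrupt changes in the commanded curvature.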
The reward function penalizes deviation from an ideal lookahead, stalled progress, and large control jumps, and includes terms for speed, high curvature, and collisions. Reward shaping and regularization stabilize training and incentivize robust zero-shot transfer.
4. Algorithmic Workflow and Implementation
The typical RPP control cycle, in both heuristic and RL-guided variants, comprises:
| Step | Computation | Output |
|---|---|---|
| 1 | Path transform, prune, and localization | Up-to-date path segment |
| 2 | Compute lookahead distance $L_t$ (heuristic or RL policy) and select lookahead point | Lookahead point |
| 3 | Curvature calculation, regulation heuristics (and RL action, if applicable) | $\kappa$, regulated $v_t$ |
| 4 | Simulate arc, collision-check | Safe $(v_t, \omega_t)$ |
| 5 | Command $(v_t, \omega_t)$ to base | Robot actuation |
Heuristic RPP is implemented in ROS 2 Nav2 via a configurable plugin. The same base implementation supports standard PP, APP, and RPP modes through parameterization (Macenski et al., 2023). RL-guided RPP integrates a PPO agent in closed-loop with the geometric controller pipeline, trained in F1TENTH Gym with lookup track features and deployed in both simulation and real hardware (Elgouhary et al., 30 Mar 2026).
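Steps 2 to 5 of the heuristic cycle can be combined into a single Python sketch. This is not the Nav2 plugin: all names and default parameter values are hypothetical placeholders, the path is assumed to be already transformed into the robot frame (step 1) and non-empty, the two heuristics are assumed to combine by taking the lower speed, and the collision check uses a caller-supplied `in_collision` callback.

```python
import math
from dataclasses import dataclass


@dataclass
class RppParams:
    # All defaults are illustrative placeholders, not tuned values.
    lookahead_time: float = 1.5  # l_t [s]
    min_lookahead: float = 0.3   # [m]
    max_lookahead: float = 0.9   # [m]
    r_min: float = 0.9           # minimum radius before slowdown [m]
    d_prox: float = 0.6          # proximity slowdown onset [m]
    alpha: float = 1.0           # proximity gain
    horizon: float = 0.5         # collision-check horizon [s]
    dt: float = 0.05             # rollout step [s]


def rpp_step(path, v_desired, d_obstacle, in_collision, params=None):
    """One control tick; returns (v, omega). path is [(x, y), ...] in robot frame."""
    p = params or RppParams()

    # Step 2: speed-scaled lookahead distance and lookahead point.
    lookahead = min(max(v_desired * p.lookahead_time, p.min_lookahead), p.max_lookahead)
    x, y = next((pt for pt in path if math.hypot(*pt) >= lookahead), path[-1])
    dist_sq = x * x + y * y
    if dist_sq < 1e-9:
        return 0.0, 0.0  # already at the end of the path

    # Step 3: curvature and the two regulation heuristics.
    kappa = 2.0 * y / dist_sq
    v_curv = v_desired
    if abs(kappa) > 1.0 / p.r_min:
        v_curv = v_desired / (p.r_min * abs(kappa))
    v_prox = v_desired
    if d_obstacle <= p.d_prox:
        v_prox = v_desired * p.alpha * d_obstacle / p.d_prox
    v = min(v_curv, v_prox)
    omega = v * kappa

    # Step 4: short-horizon rollout (Euler approximation of the arc).
    px = py = heading = 0.0
    for _ in range(int(round(p.horizon / p.dt))):
        px += v * math.cos(heading) * p.dt
        py += v * math.sin(heading) * p.dt
        heading += omega * p.dt
        if in_collision(px, py):
            return 0.0, 0.0

    # Step 5: command to send to the base.
    return v, omega
```

In an RL-guided variant, only the lookahead computation in step 2 would change: the speed-scaled distance would be replaced by the smoothed policy output, while the geometric steps stay the same.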
5. Experimental Results
Heuristic RPP demonstrates substantial improvements in safety and precision for service-class robots. On the PAL Robotics TIAGo, RPP reduced path-tracking error in sharp-turn cases relative to both PP and APP, increased blind-turn stopping margin by 33%, and improved tracking in constrained corridors.
In dynamic, high-speed environments, RL-guided RPP (RL-PP) achieved superior lap times and completion reliability under aggressive speed profiles:
| Scenario | RL-PP [s] | Adaptive PP [s] | Fixed-L [s] |
|---|---|---|---|
| Montreal, […] speed | […] | […] | […] (needed […]) |
| Yas Marina, […] speed | […] | […] | […] |
| 1:10 RoboRacer hardware, 10 laps | […] | - | Failed repeated laps |
These results establish that both heuristic (Macenski et al., 2023) and RL-based (Elgouhary et al., 30 Mar 2026) RPP significantly improve path-tracking precision, obstacle safety, and lap completion beyond traditional methods.
6. Analysis of Adaptive Behavior
Heuristic RPP’s velocity regulation minimizes overshoot in corners, ensures early braking when approaching obstacles, and prevents shortcutting in tight spaces. RL-based RPP dynamically exploits the allowable [0.35, 4.0] m lookahead range: expanding the lookahead distance on straights for stability and shrinking it in curves for sharper tracking. This automatic adaptation closely matches, but more finely tunes, the intuition behind manual lookahead schedules (Elgouhary et al., 30 Mar 2026).
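For comparison, the intuition behind a manual lookahead schedule can be written as a simple curvature-to-lookahead mapping. The linear form and the `kappa_max` saturation value below are hypothetical choices for illustration; they are not taken from either cited work. Only the [0.35, 4.0] m bounds come from the reported range.

```python
def scheduled_lookahead(kappa, kappa_max=1.0, lo=0.35, hi=4.0):
    """Manual schedule: long lookahead on straights, short in curves.

    Linearly interpolates from hi (zero curvature) down to lo
    (|kappa| >= kappa_max). kappa_max is a hypothetical tuning value.
    """
    ratio = min(abs(kappa) / kappa_max, 1.0)
    return hi - ratio * (hi - lo)
```

A learned policy can depart from such a fixed mapping, for example by reacting to upcoming curvature rather than only the current value.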
A plausible implication is that such adaptivity—whether via heuristics or machine learning—enables robust deployment without per-environment retuning. Both approaches serve as a middle ground between geometric transparency and data-driven optimization.
7. Limitations and Future Directions
Current RPP controllers assume a geometric kinematics model without explicit vehicle dynamics; abrupt velocity changes can cause wheel slip on certain platforms (Macenski et al., 2023). Minor "shortcutting" remains present in very sharp turns. RL-based RPP, while effective, lacks full ablations isolating reward and smoothing effects, and its behavior in extreme noise/tight corners is not yet fully characterized (Elgouhary et al., 30 Mar 2026).
Research directions include:
- Adding dynamics-aware velocity ramping or acceleration limits (e.g., 1D MPC overlays)
- Joint optimization of regulation parameters via reinforcement learning
- Extending collision checks for dynamic obstacles and social environments
- Broader benchmarking against other adaptive or learning-based controllers
- Formal ablation of RL state/action/reward design
- Integration of alternative lookahead selection methods ("adjusted lookahead")
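As a small illustration of the first direction, an acceleration-limited velocity ramp can be placed after the regulation heuristics so that abrupt speed changes are spread over several control ticks. This sketch is a plain rate limiter, not an MPC overlay, and the function name and limits are hypothetical.

```python
def ramp_velocity(v_prev, v_target, a_max, d_max, dt):
    """Move v_prev toward v_target without exceeding acceleration limits.

    a_max: maximum acceleration [m/s^2]
    d_max: maximum deceleration [m/s^2], given as a positive number
    dt:    control period [s]
    """
    upper = v_prev + a_max * dt
    lower = v_prev - d_max * dt
    return min(max(v_target, lower), upper)
```

One caveat with such a limiter is that it also bounds how quickly the robot can slow down, so any deceleration limit would need to remain compatible with the collision-stop behavior.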
References
- "Regulated Pure Pursuit for Robot Path Tracking" (Macenski et al., 2023)
- "Dynamic Lookahead Distance via Reinforcement Learning-Based Pure Pursuit for Autonomous Racing" (Elgouhary et al., 30 Mar 2026)