Success Weighted by Completion Time (SCT)
- The paper introduces SCT, which combines success and temporal efficiency by computing the ratio T*/max(C, T*) to account for an agent's dynamic constraints.
- It utilizes the RRT*-Unicycle planner to estimate the ideal time T*, ensuring fair comparison by incorporating kinematic and dynamic limitations in navigation tasks.
- Empirical evaluations in simulation and real-world deployments show that SCT more accurately rewards dynamical improvements compared to traditional geometric metrics like SPL.
Success Weighted by Completion Time (SCT) is a dynamics-aware evaluation metric for embodied navigation tasks, introduced to address the limitations of geometric path length metrics such as Success weighted by Path Length (SPL). By explicitly incorporating the agent’s kinematic and dynamic constraints, SCT quantifies navigation performance in terms of the agent’s proximity to time-optimal traversal, given its own dynamics. This metric provides a principled measure for comparing agents with complex or heterogeneous dynamical models, penalizing unnecessary slowness while normalizing across different episodes and environments (Yokoyama et al., 2021).
1. Metric Definition and Theoretical Foundation
SCT assigns a score in $[0, 1]$ to each navigation episode, rewarding both successful completion and temporal efficiency relative to the agent’s best-case dynamics. An episode is successful ($S = 1$) if the robot terminates within a success radius of the goal; otherwise, $S = 0$. Let $C$ denote the actual completion time (seconds until the agent issues “stop”; if unsuccessful, $C$ is capped at a maximum allowed time), and let $T^*$ denote the fastest possible time to goal afforded by the agent’s dynamics, as computed by a time-optimal motion planner. The formal metric is:
$$\mathrm{SCT} = S \cdot \frac{T^*}{\max(C, T^*)}$$
Interpretation:
- If the agent is time-optimal ($C = T^*$), SCT = 1.
- For $C > T^*$, SCT scales inversely with completion time ($\mathrm{SCT} = T^*/C$).
- Unsuccessful episodes ($S = 0$) receive SCT = 0.
- The normalization via $T^*$ ensures comparability across episodes with disparate scales and layouts; a fixed proportion above optimal time yields a consistent SCT irrespective of map size.
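The cases above can be checked with a minimal Python sketch of the per-episode formula $S \cdot T^*/\max(C, T^*)$. The function names `sct` and `mean_sct` and the episode values are illustrative assumptions, not taken from the paper.

```python
def sct(success: bool, completion_time: float, optimal_time: float) -> float:
    """Per-episode SCT = S * T* / max(C, T*)."""
    if optimal_time <= 0:
        raise ValueError("optimal_time (T*) must be positive")
    if not success:
        return 0.0  # S = 0: failed episodes score zero regardless of time
    return optimal_time / max(completion_time, optimal_time)


def mean_sct(episodes) -> float:
    """Average SCT over an iterable of (success, C, T*) tuples."""
    scores = [sct(s, c, t) for s, c, t in episodes]
    return sum(scores) / len(scores)


# Made-up episodes: (success, completion time C [s], optimal time T* [s])
episodes = [
    (True, 10.0, 10.0),    # time-optimal
    (True, 20.0, 10.0),    # twice the optimal time
    (True, 200.0, 100.0),  # same ratio on a much larger map
    (False, 15.0, 10.0),   # failure
]
scores = [sct(*episode) for episode in episodes]
print(scores, mean_sct(episodes))
```

The second and third episodes receive the same score even though their absolute times differ by a factor of ten, which is the scale invariance the normalization is meant to provide.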
2. Comparison to SPL and Geometric Metrics
Traditional metrics such as SPL are defined as:
$$\mathrm{SPL} = S \cdot \frac{l}{\max(p, l)}$$
where $p$ is the agent’s traversed path length and $l$ is the length of the shortest geometric path (e.g., on a 2D navigation mesh) connecting start and goal. SPL rewards geometric efficiency, i.e., trajectories close to the shortest route. However, SPL does not penalize dynamical suboptimality (e.g., low velocity, unnecessary pauses, or inefficient acceleration) as long as the traversed path remains geometrically short. In contrast, SCT directly captures the temporal cost under the agent’s embodiment, including its velocity and turning constraints, and can therefore discern improvements stemming from more expressive or capable dynamics models (Yokoyama et al., 2021).
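To make the contrast concrete, the sketch below scores two hypothetical agents that follow the same shortest path at different speeds. It uses the standard SPL form (success times shortest length over the larger of traversed and shortest length) and the SCT ratio from the definition section. All numbers are invented for illustration.

```python
def spl(success: bool, path_length: float, shortest_length: float) -> float:
    """SPL = S * l / max(p, l), with p traversed and l shortest path length."""
    if not success:
        return 0.0
    return shortest_length / max(path_length, shortest_length)


def sct(success: bool, completion_time: float, optimal_time: float) -> float:
    """SCT = S * T* / max(C, T*)."""
    if not success:
        return 0.0
    return optimal_time / max(completion_time, optimal_time)


# Hypothetical episode: 10 m shortest path, 20 s time-optimal traversal.
shortest_length = 10.0
optimal_time = 20.0

# Both agents drive exactly the shortest path; only their speed differs.
runs = {
    "fast": {"path": 10.0, "time": 22.0},
    "slow": {"path": 10.0, "time": 60.0},
}
results = {
    name: (
        spl(True, run["path"], shortest_length),
        sct(True, run["time"], optimal_time),
    )
    for name, run in runs.items()
}
for name, (spl_score, sct_score) in results.items():
    print(f"{name}: SPL={spl_score:.3f} SCT={sct_score:.3f}")
```

Both agents obtain SPL = 1.0, while SCT separates them (about 0.909 for the fast agent and 0.333 for the slow one).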
3. Computation of the Ideal Time $T^*$: RRT*-Unicycle Methodology
To evaluate $T^*$, the fastest feasible time from a given start pose $(x, y, \theta)$ to the goal, SCT employs RRT\*-Unicycle, an adaptation of the RRT\* sampling-based planner that operates under unicycle dynamics:
- State space: nodes are $(x, y, \theta)$, with $\theta$ (heading) chosen to minimize arrival time.
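- Edge cost: minimal travel time between poses, approximated by pivoting (in-place rotation at the maximum angular velocity $\omega_{\max}$) followed by unicycle-constrained arc traversal at linear velocity $v_{\max}$ with angular velocity $|\omega| \le \omega_{\max}$.
- Lookup table: precomputed for pairs of start and target configurations, storing the pivot-arc decomposition that minimizes travel time.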
Key algorithmic steps (see the summary pseudocode in Yokoyama et al., 2021):
- Randomized tree expansion samples free-space or near-fastest-path targets, with rewiring for optimality.
- For arc connectivity, collision checks and minimum travel time computations are applied.
- As the sample count $n \to \infty$, the best root-to-goal time approaches the true time-optimal value under the kinematic constraints.
A crucial feature is that RRT*-Unicycle accounts for the agent’s action and velocity discretization, yielding a $T^*$ faithful to each agent’s own embodiment.
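The pivot-then-arc edge cost can be illustrated with a simplified sketch. This is not the paper's lookup table. It assumes a continuous unicycle that pivots in place at a maximum angular velocity `w_max`, then drives a single circular arc at a maximum linear velocity `v_max`, with the arrival heading left free. The function name `pivot_arc_time`, the grid resolution, and the restriction to arcs of at most a half circle are all illustrative choices.

```python
import math


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def pivot_arc_time(dx, dy, heading, v_max, w_max, n_pivots=721):
    """Grid-search the pivot angle minimizing pivot time plus arc time to (dx, dy)."""
    d = math.hypot(dx, dy)
    if d == 0.0:
        return 0.0
    bearing = math.atan2(dy, dx)
    r_min = v_max / w_max  # tightest arc drivable at full linear speed
    # Baseline candidate: pivot to face the target, then drive straight.
    best = abs(wrap_angle(bearing - heading)) / w_max + d / v_max
    for i in range(n_pivots):
        pivot = -math.pi + 2.0 * math.pi * i / (n_pivots - 1)
        # Angle between the post-pivot heading and the chord to the target.
        alpha = abs(wrap_angle(bearing - (heading + pivot)))
        if alpha > math.pi / 2:
            continue  # skip arcs sweeping more than a half circle
        if alpha < 1e-9:
            arc_len = d  # degenerate arc: straight line
        else:
            # Circle tangent to the heading through the target:
            # chord d = 2 R sin(alpha), swept angle = 2 alpha.
            radius = d / (2.0 * math.sin(alpha))
            if radius < r_min - 1e-12:
                continue  # would need an angular rate above w_max at v_max
            arc_len = 2.0 * alpha * radius
        best = min(best, abs(pivot) / w_max + arc_len / v_max)
    return best


t_ahead = pivot_arc_time(1.0, 0.0, 0.0, v_max=1.0, w_max=1.0)
t_behind = pivot_arc_time(-1.0, 0.0, 0.0, v_max=1.0, w_max=1.0)
print(t_ahead, t_behind)
```

For a target directly behind the robot, the search finds that a partial pivot followed by a curved arc beats a full 180° pivot followed by a straight segment. Gaps of this kind are what a purely geometric path length cannot express.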
4. Experimental Paradigms and Empirical Findings
SCT was evaluated in the context of deep RL-based navigation using both simulation and real-world robotic experiments:
- Simulation: Habitat 2.0 platform, Gibson-4+ dataset, with photorealistic indoor environments (72 train, 14 val scenes).
- Agents:
  - Point-turn (4 actions: {forward, turn-left, turn-right, stop}),
  - Unicycle-6 (6 discrete actions: 2 forward speeds × 3 turn rates),
  - Unicycle-15 (15 actions: 3 forward speeds × 5 turn rates).
- Training: Decentralized Distributed PPO (DD-PPO), depth+cognitive inputs, ResNet-50 + 2-layer LSTM encoder, 150M training steps. Reward shaping: “shaped” and “decaying” schedules differing in temporal focus.
- Evaluation: Success rate (S), average SPL, average SCT (computed per start/goal pair using the RRT*-Unicycle estimate of $T^*$), and intersection means of SPL and SCT, i.e., averages restricted to episodes in which all agents succeed.
| Reward | Agent | Avg. SCT | Avg. SPL | Success (%) |
|---|---|---|---|---|
| Shaped | Point-turn | 65.8 | 88.4 | 94.2 |
| Shaped | Unicycle-6 | 82.2 | 82.8 | 93.2 |
| Decaying | Point-turn | 66.0 | 87.9 | 94.0 |
| Decaying | Unicycle-6 | 88.9 | 88.1 | 98.8 |
Interpretation: SPL tends to favor agents able to execute tighter geometric turns (e.g., point-turn), but SCT reveals that unicycle-dynamics agents reach goals substantially more quickly (20–30% faster in real time). The intersection-average SCT emphasizes the temporal advantage of the unicycle agents, demonstrating SCT’s value in revealing underlying dynamical gains (Yokoyama et al., 2021).
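An intersection mean averages a metric only over episodes that every compared agent completed successfully, so that differences in success rate do not leak into the efficiency comparison. A minimal sketch follows, with a hypothetical helper `intersection_mean` and made-up per-episode values that are not the paper's data.

```python
def intersection_mean(scores_by_agent, success_by_agent):
    """Mean score per agent over episodes in which all agents succeeded."""
    agents = list(scores_by_agent)
    n_episodes = len(success_by_agent[agents[0]])
    keep = [
        i for i in range(n_episodes)
        if all(success_by_agent[agent][i] for agent in agents)
    ]
    if not keep:
        raise ValueError("no episode was solved by every agent")
    return {
        agent: sum(scores_by_agent[agent][i] for i in keep) / len(keep)
        for agent in agents
    }


# Made-up per-episode outcomes for two agents on the same four episodes.
success = {
    "point_turn": [True, True, False, True],
    "unicycle": [True, True, True, False],
}
sct_scores = {
    "point_turn": [0.6, 0.7, 0.0, 0.8],
    "unicycle": [0.8, 0.9, 0.85, 0.0],
}
means = intersection_mean(sct_scores, success)
print(means)
```

Only the first two episodes enter the average here, because each of the other two was failed by one of the agents.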
5. Real-World Deployment and Generalization
SCT was further validated in real-robot experiments with a LoCoBot platform navigating a two-room apartment. The robot used a 2D LiDAR SLAM map (Hector-SLAM) for $T^*$ computation:
| Agent | Avg. Completion Time (s) | Avg. Path (m) | SCT | SPL |
|---|---|---|---|---|
| Point-turn | 73.5 | 6.81 | 37.4 | 91.1 |
| Unicycle-6 | 47.8 | 6.95 | 57.0 | 89.2 |
A key finding is that the unicycle policy retained its in-simulation temporal advantage in zero-shot transfer to real hardware, achieving approximately 35% faster completion and higher SCT, even though the traversed path was marginally longer (lower SPL) (Yokoyama et al., 2021).
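The relative speed-up and the path-length difference follow directly from the averages in the table above (completion times of 73.5 s and 47.8 s, path lengths of 6.81 m and 6.95 m):

```latex
\[
\frac{73.5 - 47.8}{73.5} = \frac{25.7}{73.5} \approx 0.35,
\qquad
\frac{6.95 - 6.81}{6.81} = \frac{0.14}{6.81} \approx 0.02
\]
```

That is, the unicycle agent finished roughly 35% sooner while driving a path about 2% longer.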
6. Practical Implications and Limitations
- Planner fidelity: The accuracy of $T^*$ is contingent upon the RRT*-Unicycle approximation. Higher sample counts or improved arc-time models are necessary for reliable estimation in complex or expansive environments.
- Embodiment granularity: Discretization of actions constrains how closely the agent can approach the continuous-control $T^*$. Employing a continuous-action unicycle policy may further close the gap.
- Metric nature: SCT requires post hoc planning and global map knowledge (typically available only via SLAM), precluding real-time application as a reward during learning. SCT is strictly evaluative.
- Assumptions: SCT normalization presumes static environments; dynamic obstacles or time-varying cost functions would necessitate methodological extension.
- Sim-to-real gaps: Physical-world factors (friction, sensory noise) may alter the actual $C$ and $T^*$; recalculating $T^*$ using the real robot’s SLAM map is required for unbiased evaluation.
A plausible implication is that SCT enables more principled comparison of navigation agents with disparate dynamic capabilities, guiding the development and assessment of control policies that fully exploit embodied physical affordances.
7. Summary and Context
Success weighted by Completion Time supersedes geometric path-based measures by replacing “shortest path” with a “fastest time” criterion that is explicitly sensitive to agent dynamics. SCT’s normalized, [0,1]-ranged scores facilitate fair, context-invariant benchmarking of learned policies and planning algorithms, directly rewarding dynamical optimality and encouraging the design of policies tailored to the unique constraints of real mobile robots (Yokoyama et al., 2021).