SPL: Navigation Efficiency Metric

Updated 7 December 2025

SPL is a standardized metric that evaluates navigation performance by combining success rate with path optimality through a normalized score.
It penalizes detours by comparing the executed trajectory length with the geodesic shortest-path distance, ensuring only efficient routes score highly.
Empirical studies using SPL have driven advancements in both simulation and real-world navigation benchmarks, influencing algorithm design and performance evaluation.

Success weighted by Path Length (SPL) is a standard scalar navigation efficiency metric used to evaluate embodied and autonomous agents in goal-directed navigation tasks. SPL quantifies both the reliability (success rate) and the efficiency (path optimality) of navigation policies, providing a normalized score that facilitates fair comparison across heterogeneous methods and environments. SPL has become the principal metric for benchmarking classic and learned navigation pipelines in simulated and real-world environments, and is foundational in recent works targeting embodied object-goal navigation and similar scenarios (Chabal et al., 30 Nov 2025, Mishkin et al., 2019).

1. Formal Definition and Computation

SPL is measured over a set of $N$ navigation episodes. For episode $i$ :

$S_i \in \{0,1\}$ : success indicator (1 if the agent reaches the goal within the defined success criteria, 0 otherwise)
$L_i$ : geodesic shortest-path distance from the start to the goal or nearest valid stopping location ("ideal" path length)
$P_i$ : actual trajectory length traversed by the agent ("executed" path)

The SPL metric is given by

$\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i\, \frac{L_i}{\max(P_i, L_i)}$

This formulation caps per-episode contribution at 1, only credits successful episodes, and down-weights episodes with unnecessarily long paths (Mishkin et al., 2019, Chabal et al., 30 Nov 2025, Yokoyama et al., 2021).

Stepwise computation for each episode:

Determine success: $S_i = 1$ if the agent terminates within the success region (e.g., within 1 m of the goal and satisfying a visibility threshold), else $S_i = 0$ .
Compute ideal distance: $L_i$ via geodesic shortest path (e.g., A*, FMM), often measured on 2D or 3D navigation meshes.
Accumulate executed path length: $P_i$ as sum of Euclidean distances for each movement step; rotations typically not counted unless explicitly specified (see below).
Compute per-episode score: $S_i L_i / \max(L_i, P_i)$ . Zero if unsuccessful.
Average over all episodes for the final SPL.

A numerical example as reported in (Mishkin et al., 2019):

Episode 1: $L_1 = 10$ , $P_1 = 15$ , $S_1=1$ yields $10/15 \approx 0.667$
Episode 2: $L_2 = 8$ , $P_2 = 12$ , $S_2=0$ yields $0$
Episode 3: $L_3 = 5$ , $P_3 = 5$ , $S_3=1$ yields $1$

Final $\mathrm{SPL} = (0.667 + 0 + 1)/3 \approx 0.556$ .
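
The stepwise computation and the three-episode example above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the `Episode` container and the function names are hypothetical, and the handling of a degenerate episode with $L_i = P_i = 0$ is an assumption that the definition itself does not cover.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Episode:
    success: bool         # S_i
    shortest_path: float  # L_i, geodesic start-to-goal distance
    agent_path: float     # P_i, length actually travelled


def episode_spl(ep: Episode) -> float:
    """Per-episode term S_i * L_i / max(P_i, L_i)."""
    if not ep.success:
        return 0.0
    denom = max(ep.agent_path, ep.shortest_path)
    if denom == 0.0:
        # Assumption: an episode that starts at the goal scores 1.0.
        return 1.0
    return ep.shortest_path / denom


def spl(episodes: Iterable[Episode]) -> float:
    """Average of the per-episode terms over all episodes."""
    scores = [episode_spl(ep) for ep in episodes]
    if not scores:
        raise ValueError("SPL is undefined for an empty episode set")
    return sum(scores) / len(scores)


episodes = [
    Episode(success=True, shortest_path=10.0, agent_path=15.0),
    Episode(success=False, shortest_path=8.0, agent_path=12.0),
    Episode(success=True, shortest_path=5.0, agent_path=5.0),
]
print(round(spl(episodes), 3))  # 0.556
```

The `max` in the denominator is what caps the per-episode term at 1: an executed path reported as shorter than $L_i$ (for instance because of measurement noise) cannot score above a perfect episode.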

2. Rationale for SPL and Comparative Advantages

SPL is designed to balance goal-reaching reliability with path optimality. Unlike Success Rate (SR) or mean path length (PL) alone, SPL:

Penalizes Detours: Reaching the goal via sub-optimal routes reduces SPL by the ratio $L_i/\max(P_i, L_i)$ , discouraging circuitous exploration (Chabal et al., 30 Nov 2025).
Credits Only Successful Episodes: SPL is zero for failures, thus not inflated by agents that merely wander extensively.
Normalized and Comparable: SPL is always in $[0,1]$ and supports direct comparison across agents and environment complexities.
Realistic Efficiency Measure: It accounts for real-world requirements such as minimizing energy and time by favoring near-geodesic, successful trajectories.
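
A small illustration of the first two points, using made-up episode data (every number below is hypothetical): two agents succeed on the same episodes and therefore share a success rate, yet the one that wanders receives a much lower SPL.

```python
def success_rate(successes):
    """Fraction of successful episodes."""
    return sum(successes) / len(successes)


def spl(successes, shortest, travelled):
    """Mean of S_i * L_i / max(P_i, L_i); assumes every L_i > 0."""
    terms = [s * l / max(p, l) for s, l, p in zip(successes, shortest, travelled)]
    return sum(terms) / len(terms)


# Hypothetical data: four episodes, each with a 10 m shortest path.
shortest = [10.0, 10.0, 10.0, 10.0]
successes = [1, 1, 1, 0]                    # identical outcomes for both agents
direct_paths = [10.0, 11.0, 10.0, 30.0]     # near-geodesic agent
wandering_paths = [40.0, 25.0, 20.0, 30.0]  # agent that takes long detours

sr = success_rate(successes)
spl_direct = spl(successes, shortest, direct_paths)
spl_wandering = spl(successes, shortest, wandering_paths)
print(sr, spl_direct, spl_wandering)
```

The failed fourth episode contributes zero for both agents regardless of how far they travelled, so the gap between the two SPL values comes entirely from the detours on successful episodes.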

In embodied navigation contexts, SPL is preferred over SR or path length alone for its stricter coupling of success with navigation efficiency (Chabal et al., 30 Nov 2025, Mishkin et al., 2019).

3. Implementation Protocols in Benchmarking

General Protocol

Success Threshold: Defined by proximity (typically $\leq$ 1 m) and/or visibility to the goal; agent must execute a STOP action within this region.
Navigation Budget: A maximum number of steps is set for each episode (commonly 500 discrete steps).
Trajectory Measurement: Only translational motions are accumulated in $P_i$ unless the protocol includes penalization for rotations (as in FOM-Nav).
Geodesic Computation: Ideal path $L_i$ is precomputed using algorithms like A* or FMM on the environment's obstacle map or ground truth mesh.
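
The success and trajectory rules of the general protocol can be sketched as below. The function names, the logged-position format, and the default arguments are assumptions made for illustration; the visibility condition is omitted because it depends on the simulator.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def executed_path_length(positions: Sequence[Point]) -> float:
    """Sum of Euclidean distances between consecutive agent positions.

    A rotation in place leaves the position unchanged, so it adds 0 here.
    """
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def is_success(called_stop: bool, distance_to_goal: float, steps_used: int,
               success_radius: float = 1.0, max_steps: int = 500) -> bool:
    """Success requires a STOP inside the success radius within the step budget."""
    return (called_stop
            and distance_to_goal <= success_radius
            and steps_used <= max_steps)


# Hypothetical log: forward 0.25 m, rotate in place, forward 0.25 m.
trajectory = [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0), (0.25, 0.0, 0.0), (0.5, 0.0, 0.0)]
print(executed_path_length(trajectory))  # 0.5
print(is_success(called_stop=True, distance_to_goal=0.8, steps_used=120))  # True
```

A protocol that penalizes rotations would need an explicit rule for converting a rotation step into path length, which this sketch does not attempt.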

FOM-Nav Specifics

Trajectory Length: Includes all forward 25 cm motions; for HM3D v2, only 2D $(x, y)$ displacements are counted due to height annotation inconsistencies.
Rotation: Episodes typically start with a 360° rotation; these steps are added to $P_i$ and penalize SPL if excessive (Chabal et al., 30 Nov 2025).

Table: Success Criteria and Path Calculation (as per FOM-Nav)

Protocol Aspect	FOM-Nav Specification	Standard Benchmark Specification (Mishkin et al., 2019)
Success distance	$\leq 1$ m + visibility	$\leq$ 0.2 m (goal radius)
Path measurement	25 cm translation per step; includes rotations	Only forward translation; rotations ignored
Geodesic estimation	FMM on learned/auto map	A* on ground-truth mesh
Step/time budget	500 steps	500 steps or 50 s
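
Both columns depend on a geodesic estimate of $L_i$. As a rough stand-in for the A* and FMM planners named in the table, the sketch below runs Dijkstra's algorithm on an 8-connected occupancy grid; this is a swapped-in technique chosen for brevity, not the method of either protocol. The grid encoding (`True` means free), the `cell_size` parameter, and the function name are assumptions.

```python
import heapq
import math
from typing import List, Tuple

Cell = Tuple[int, int]


def grid_geodesic(free: List[List[bool]], start: Cell, goal: Cell,
                  cell_size: float = 1.0) -> float:
    """Shortest obstacle-free path length from start to goal, or inf if unreachable."""
    rows, cols = len(free), len(free[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1),
             (-1, -1), (-1, 1), (1, -1), (1, 1)]
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d * cell_size
        if d > dist.get((r, c), math.inf):
            continue  # stale heap entry
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or not free[nr][nc]:
                continue
            # Disallow cutting corners: a diagonal step needs both side cells free.
            if dr != 0 and dc != 0 and not (free[r][nc] and free[nr][c]):
                continue
            nd = d + math.hypot(dr, dc)
            if nd < dist.get((nr, nc), math.inf):
                dist[(nr, nc)] = nd
                heapq.heappush(heap, (nd, (nr, nc)))
    return math.inf


# Hypothetical 3x3 map with a single obstacle in the centre.
grid = [
    [True, True, True],
    [True, False, True],
    [True, True, True],
]
print(grid_geodesic(grid, (0, 0), (2, 2)))  # 4.0
```

Because movement is restricted to eight directions, a grid estimate of this kind can overestimate the true geodesic distance, which in turn inflates SPL slightly; mesh-based or FMM estimates reduce that bias.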

4. Empirical Results and Component Analysis

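SPL serves as the principal evaluation metric in recent navigation benchmarks.

Comparative Performance (from FOM-Nav (Chabal et al., 30 Nov 2025))

SPL values in this table are expressed as percentages (SPL × 100).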

Method	MP3D_sub SPL	HM3D v1_sub SPL	HM3D v2 SPL
RIM	15.8	27.6	22.2
PIRLNav	—	34.7	27.0
VLFM*	19.6	37.6	33.0
VLFM†	19.8	39.6	33.6
FOM-Nav	23.9	52.1	47.9

Incremental improvements in SPL can be attributed to architectural modifications. In FOM-Nav, moving from basic to full models resulted in a $+9.7$ SPL improvement, and ablation studies show contributions from explicit object/scene encoding, classical planning, and mixed ground-truth plus auto-generated map data.

5. Limitations and Domain-Specific Caveats

SPL assumes that geodesic path length is a faithful surrogate for true navigation cost. It has notable limitations:

Ignores Rotation and Idle Time: Standard SPL ignores non-translational actions, which may distort efficiency assessments, especially for agents with complex or nonholonomic dynamics (Yokoyama et al., 2021, Mishkin et al., 2019).
Harsh Failure Treatment: Any failure, regardless of proximity to success, contributes zero.
Dependence on Oracle Path: SPL presumes access to $L_i$ , which may not reflect reachable trajectories in dynamic, imperfectly mapped, or real-world scenarios (Mishkin et al., 2019).
Insensitive to Dynamics: For agents with curved dynamics (e.g., a unicycle model), the fastest path is not necessarily the shortest in distance, so SPL can underreport the efficiency of such agents compared to point-turn models (Yokoyama et al., 2021).

Alternative measures, such as Success weighted by Completion Time (SCT), address some of these issues by normalizing to minimum-time trajectories defined via agent dynamics (Yokoyama et al., 2021).
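
For comparison, SCT keeps the structure of SPL but replaces lengths with times. The form below is given as recalled from Yokoyama et al. (2021) and should be checked against that paper; here $T_i$ denotes the minimum time needed to reach the goal under the agent's dynamics and $C_i$ the time the agent actually took:

$\mathrm{SCT} = \frac{1}{N} \sum_{i=1}^{N} S_i\, \frac{T_i}{\max(C_i, T_i)}$

As with SPL, failed episodes contribute zero and the per-episode term is capped at 1.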

6. Influence on Navigation Research and Future Directions

SPL is the canonical metric for navigation tasks in the Habitat, Matterport3D, and HM3D evaluation protocols. Its widespread use supports reproducibility and apples-to-apples comparison across research groups and methodologies (Mishkin et al., 2019, Chabal et al., 30 Nov 2025).

The introduction of SPL has significantly influenced the design of embodied navigation agents, incentivizing methods that reliably reach goals without excessively circuitous behaviors. Contemporary research explores SPL-driven architectural ablations, data policies, multimodal perception pipelines, and hybrid classical-learning solutions for maximizing SPL on standard benchmarks.

Recent works also highlight the need for dynamics-aware or more nuanced efficiency criteria, such as SCT, to address SPL’s insensitivity to non-Euclidean agent motion, rotational inefficiencies, and application-specific energy/time budgets (Yokoyama et al., 2021).

7. References

Chabal et al., "FOM-Nav: Frontier-Object Maps for Object Goal Navigation", 30 Nov 2025
Mishkin et al., "Benchmarking Classic and Learned Navigation in Complex 3D Environments", 2019
Yokoyama et al., "Success Weighted by Completion Time: A Dynamics-Aware Evaluation Criteria for Embodied Navigation", 2021
