Average Displacement Error (ADE) Explained

Updated 26 May 2026

Average Displacement Error (ADE) measures the mean positional divergence between predicted and actual trajectories across a prediction window, crucial for model validation.
Applications of ADE include pedestrian movement forecasting, autonomous vehicle navigation, and multi-agent simulations, providing a standard for evaluating predictive accuracy.
Despite its utility, ADE has limitations in multi-agent systems where context recognition and risk significance are critical, suggesting alternative or complementary metrics may be needed.

Average Displacement Error (ADE) is a foundational quantitative metric in trajectory prediction, pedestrian and vehicle forecasting, autonomous driving, safe navigation in GPS-denied environments, and multi-agent interactive systems. It provides a scalar measure of mean positional divergence between a model's predicted path and the true trajectory, assessed over an entire prediction horizon. ADE is ubiquitous in the literature as a primary gauge for model fidelity, yet various works have illuminated critical conceptual and practical limitations, particularly in multi-agent and safety-critical applications.

1. Mathematical Formulation and Computation

ADE quantifies the average per-step spatial error between predicted and ground-truth trajectories. In its general form, for a trajectory of length $T$, the ADE is:

$\mathrm{ADE} = \frac{1}{T} \sum_{t=1}^T \| \hat{\mathbf{p}}_{t} - \mathbf{p}_{t} \|_2$

where $\hat{\mathbf{p}}_{t} \in \mathbb{R}^2$ is the predicted position at time $t$, $\mathbf{p}_{t}$ is the ground-truth position, and $\|\cdot\|_2$ is the Euclidean norm (Ahmadi et al., 2023, Liu et al., 11 Oct 2025, Tran et al., 2023, Wei et al., 2024, Mohan et al., 5 May 2026, Sapkota et al., 4 May 2025, Ahmad et al., 5 Aug 2025, Mohamed et al., 2022).
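
As a minimal sketch of the formula above, the following NumPy function computes ADE for one trajectory. The function name `ade` and the `(T, 2)` array layout are illustrative assumptions, not taken from any of the cited works.

```python
import numpy as np

def ade(pred, gt):
    """Mean Euclidean distance between predicted and true positions.

    pred, gt: arrays of shape (T, 2) holding (x, y) per time step (assumed layout).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    per_step = np.linalg.norm(pred - gt, axis=-1)  # shape (T,), one distance per step
    return float(per_step.mean())

# Ground truth moves along the x-axis; the prediction drifts steadily in y.
gt = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
pred = np.array([[1.0, 0.0], [2.0, 0.5], [3.0, 1.0], [4.0, 1.5]])
print(ade(pred, gt))  # per-step errors 0, 0.5, 1.0, 1.5 -> 0.75
```

The per-step errors grow from 0 m to 1.5 m, and ADE reports their mean of 0.75 m.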

Extension to multi-agent and multi-modal settings is standard. For $N$ agents:

$\mathrm{ADE}_{multi} = \frac{1}{N} \sum_{i=1}^N \frac{1}{T} \sum_{t=1}^{T} \| \hat{\mathbf{p}}_{i,t} - \mathbf{p}_{i,t} \|_2$

In probabilistic models, ADE may be computed for each stochastic sample and then further aggregated using a minimum ("Best-of-N") to account for diverse plausible futures (Mohamed et al., 2022, Ahmad et al., 5 Aug 2025).
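
The multi-agent average and the Best-of-N minimum can be sketched as follows. The names `ade_multi` and `min_ade`, the number of samples `K`, and the array shapes are assumptions made for illustration.

```python
import numpy as np

def ade_multi(pred, gt):
    """pred, gt: (N, T, 2). Average over time per agent, then over agents."""
    diff = np.asarray(pred, dtype=float) - np.asarray(gt, dtype=float)
    per_step = np.linalg.norm(diff, axis=-1)      # (N, T)
    return float(per_step.mean(axis=1).mean())

def min_ade(samples, gt):
    """samples: (K, T, 2) stochastic predictions; gt: (T, 2).

    Returns the smallest per-sample ADE and the index of that sample.
    """
    samples = np.asarray(samples, dtype=float)
    gt = np.asarray(gt, dtype=float)
    per_sample = np.linalg.norm(samples - gt[None], axis=-1).mean(axis=1)  # (K,)
    best = int(per_sample.argmin())
    return float(per_sample[best]), best

# Two agents, three steps: agent 0 is off by 1 m in x, agent 1 by 3 m in y.
gt_multi = np.zeros((2, 3, 2))
pred_multi = gt_multi.copy()
pred_multi[0, :, 0] = 1.0
pred_multi[1, :, 1] = 3.0
print(ade_multi(pred_multi, gt_multi))  # (1 + 3) / 2 = 2.0

# Three sampled futures for one agent, offset by 1.0, 0.2 and 3.0 m in x.
gt_single = np.zeros((3, 2))
samples = np.stack([gt_single + [1.0, 0.0],
                    gt_single + [0.2, 0.0],
                    gt_single + [3.0, 0.0]])
best_value, best_index = min_ade(samples, gt_single)
print(round(best_value, 6), best_index)  # 0.2 1
```

Because every trajectory here has the same length, averaging per agent first and then across agents gives the same result as pooling all steps. With variable-length trajectories the two orders differ.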

2. Stepwise Evaluation and Workflow

The canonical ADE computation pipeline encompasses:

Data Preparation: Extract historical and future positions for each agent from the dataset (e.g., ETH/UCY, SinD, nuScenes) (Ahmadi et al., 2023, Wei et al., 2024, Liu et al., 11 Oct 2025).
Prediction Generation: Model (LSTM, Transformer, GAN, etc.) outputs a sequence $\{ \hat{\mathbf{p}}_t \}_{t=1}^T$ of predicted (x, y) positions (Ahmadi et al., 2023, Mohan et al., 5 May 2026).
Error Measurement: For each time step $t$, compute the Euclidean displacement $\| \hat{\mathbf{p}}_{t} - \mathbf{p}_{t} \|_2$.
Temporal Averaging: Compute the per-trajectory mean over $T$ steps.
Dataset-Level Aggregation: Average ADEs over all trajectories, agents, or test cases.
Special Protocols for Multi-Modal Models: For each agent, select the prediction sample attaining the minimum ADE ("minJointADE," "BoN-ADE") for fair comparison (Ahmad et al., 5 Aug 2025, Mohamed et al., 2022).
Unit Convention: Distances are reported in meters, in bird's-eye or ego-centric frames.
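
The steps above can be sketched end to end on toy data. A constant-velocity extrapolation stands in for the learned model here. It is a placeholder chosen for brevity, not a method from the cited works, and the names `constant_velocity_predict`, `evaluate_dataset`, `obs_len`, and `pred_len` are hypothetical.

```python
import numpy as np

def constant_velocity_predict(history, horizon):
    """Placeholder predictor: extrapolate the last observed velocity."""
    velocity = history[-1] - history[-2]
    steps = np.arange(1, horizon + 1)[:, None]   # (horizon, 1)
    return history[-1] + steps * velocity        # (horizon, 2)

def evaluate_dataset(tracks, obs_len, pred_len):
    """tracks: list of (obs_len + pred_len, 2) arrays, positions in meters."""
    per_track = []
    for track in tracks:
        track = np.asarray(track, dtype=float)
        # Data preparation: split into observed history and future to predict.
        history, future = track[:obs_len], track[obs_len:obs_len + pred_len]
        # Prediction generation (stand-in model).
        pred = constant_velocity_predict(history, pred_len)
        # Error measurement: Euclidean displacement at each step.
        per_step = np.linalg.norm(pred - future, axis=-1)
        # Temporal averaging: one ADE per trajectory.
        per_track.append(float(per_step.mean()))
    # Dataset-level aggregation: mean of per-trajectory ADEs.
    return float(np.mean(per_track)), per_track

straight = [[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [5, 0]]
turning = [[0, 0], [1, 0], [2, 0], [3, 1], [4, 2], [5, 3]]
dataset_ade, per_track = evaluate_dataset([straight, turning], obs_len=3, pred_len=3)
print(per_track)    # [0.0, 2.0]
print(dataset_ade)  # 1.0
```

The straight track is predicted exactly, while the turning track accumulates errors of 1, 2, and 3 m, so the dataset-level figure of 1.0 m hides a wide spread between the two cases.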

3. Reported ADE Values Across Domains

Empirical ADE figures vary widely across domains, datasets, and prediction horizons:

Application / Dataset	ADE (Lower is Better)	Reference
Pedestrian (ETH, UCY)	1.2586–3.6030 m, 6.2% lower vs baseline	(Ahmadi et al., 2023)
Multi-vehicle (SinD, KI-GAN)	0.05 m (6 s), 0.11 m (9 s)	(Wei et al., 2024)
Autonomous Driving	<0.08 m (nominal), 3–4 m under PGD attack	(Mohan et al., 5 May 2026)
Digital Twin Work Zones	0.1327 m (minJointADE, HPNet model)	(Ahmad et al., 5 Aug 2025)
Safe Navigation (real-imit.)	0.2393 m (LanBLoc-BMM-EKF)	(Sapkota et al., 4 May 2025)

Sub-meter ADE values indicate close geometric agreement with the ground truth, whereas adversarial inputs can raise ADE by more than an order of magnitude (from under 0.08 m to 3–4 m under PGD attack), highlighting architectural brittleness (Mohan et al., 5 May 2026).

4. Relationship to FDE and Complementary Metrics

ADE is typically paired with Final Displacement Error (FDE):

$\mathrm{FDE} = \| \hat{\mathbf{p}}_{T} - \mathbf{p}_{T} \|_2$

ADE provides a global, path-wide accuracy assessment, while FDE isolates endpoint precision, which is critical for downstream planning or collision avoidance. In multi-modal evaluations (e.g., GANs, variational predictors), minADE and minFDE are standard, reporting the error of the best mode among $K$ samples (Mohamed et al., 2022, Wei et al., 2024, Ahmad et al., 5 Aug 2025, Liu et al., 11 Oct 2025).
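
To illustrate why the two metrics are reported together, the sketch below takes FDE as the displacement at the final step $T$ and builds two toy predictions with the same ADE but different FDE. The function names and the data are illustrative assumptions.

```python
import numpy as np

def ade(pred, gt):
    """Mean displacement over all T steps."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def fde(pred, gt):
    """Displacement at the final step only."""
    return float(np.linalg.norm(pred[-1] - gt[-1]))

gt = np.zeros((4, 2))
# Prediction A: constant 1 m lateral offset at every step.
pred_a = np.array([[0, 1], [0, 1], [0, 1], [0, 1]], dtype=float)
# Prediction B: exact for three steps, then 4 m off at the endpoint.
pred_b = np.array([[0, 0], [0, 0], [0, 0], [0, 4]], dtype=float)

print(ade(pred_a, gt), fde(pred_a, gt))  # 1.0 1.0
print(ade(pred_b, gt), fde(pred_b, gt))  # 1.0 4.0
```

Both predictions score an ADE of 1.0 m, yet prediction B misses the endpoint by 4 m. Only FDE separates the two cases.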

Extensions to scenario-driven evaluations and alternative risk-weighted metrics (e.g., Average Weighted Risk Score, AWRS (Sapkota et al., 4 May 2025); Average Mahalanobis Distance, AMD (Mohamed et al., 2022)) have emerged to capture safety-critical and uncertainty-aware phenomena beyond displacement-only measures.

5. Strengths and Limitations

Strengths

Intuitive: Directly quantifies mean trajectory error in interpretable spatial units (meters).
Holistic: Aggregates error over the entire forecast, exposing accumulated deviations and the model’s handling of dynamic variations (Ahmadi et al., 2023, Tran et al., 2023).
Broad Applicability: Usable for single- and multi-agent, deterministic and generative models, across diverse domains.

Limitations

Uniform Step Weighting: All time steps contribute equally, which can mask large endpoint errors (motivation for FDE) or critical mispredictions near obstacles (Sapkota et al., 4 May 2025, Liu et al., 11 Oct 2025).
Blind to Uncertainty: Especially in "Best-of-N" formulations, ADE disregards variance and distributional spread, potentially overstating accuracy when only a single mode or sample nears the ground truth (Mohamed et al., 2022).
Context-Oblivious: Fails to distinguish errors occurring in high-risk or scenario-dependent regions (e.g., intersections, merges, curved scenarios); scenario-specific failures may be hidden in aggregate scores (Liu et al., 11 Oct 2025).
Non-informative for Closed-Loop Safety: Static, dataset-based ADE demonstrates poor correlation with real-world driving safety or closed-loop control efficacy, due to the dynamics gap (mismatch between open-loop evaluation and closed-loop performance) (Tran et al., 2023).
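
The uncertainty blindness of Best-of-N scoring can be made concrete with a toy comparison. The sample sets below are constructed by hand rather than drawn from any model, and the helper `per_sample_ade` is a hypothetical name. The mean over samples is shown only as a contrast and is not one of the alternative metrics from the cited works.

```python
import numpy as np

def per_sample_ade(samples, gt):
    """samples: (K, T, 2); gt: (T, 2). Returns one ADE per sample, shape (K,)."""
    return np.linalg.norm(samples - gt[None], axis=-1).mean(axis=1)

gt = np.zeros((2, 2))
# Tight set: every sample stays within 0.3 m of the ground truth.
tight = np.stack([gt + [0.1, 0.0], gt + [0.2, 0.0], gt + [0.3, 0.0]])
# Dispersed set: one sample is close, the others are 5 m and 10 m away.
dispersed = np.stack([gt + [0.1, 0.0], gt + [5.0, 0.0], gt + [10.0, 0.0]])

for name, samples in [("tight", tight), ("dispersed", dispersed)]:
    errors = per_sample_ade(samples, gt)
    print(name, "min:", round(float(errors.min()), 3),
          "mean:", round(float(errors.mean()), 3))
# tight min: 0.1 mean: 0.2
# dispersed min: 0.1 mean: 5.033
```

Both sets report the same minimum of 0.1 m, although the dispersed set places most of its samples meters away from the true path.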

6. Identified Biases and Evaluation Controversies

Recent works have challenged ADE as a sole performance indicator:

Distributional Bias in BoN-ADE: Selecting only the minimum error over sampled trajectories renders ADE insensitive to distributional drift or misspecification—entire sample clouds may be misaligned or excessively dispersed despite reporting low ADE for a lucky sample (Mohamed et al., 2022). Average Mahalanobis Distance (AMD) and Average Maximum Eigenvalue (AMV) have been suggested as alternatives to measure centroidal accuracy and predictive spread, respectively.
Unsafe or Blind Regions: Models can achieve competitive ADEs yet fail in semantically critical regions such as unstructured intersections or map-free areas. Scenario-aware decomposition (by semantic context, agent density, road geometry) has been proposed to expose such vulnerabilities (Liu et al., 11 Oct 2025).
Dynamics Gap: ADE on static datasets does not transfer to deployed performance in interactive environments, as the predictor affects the agent’s future, altering the evolution of surrounding traffic or pedestrians (Tran et al., 2023). The introduction of "Dynamic ADE," i.e., ADE in closed-loop simulation or real-time settings, addresses this gap, showing significantly improved correlation to downstream driving quality.
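
A scenario-aware breakdown of the kind described above can be sketched by grouping per-trajectory ADE values by a context label. The tags, the values, and the function name `ade_by_scenario` are hypothetical and serve only to show the mechanics.

```python
from collections import defaultdict

import numpy as np

def ade_by_scenario(per_track_ade, tags):
    """Group per-trajectory ADE values by scenario tag and average each group."""
    groups = defaultdict(list)
    for value, tag in zip(per_track_ade, tags):
        groups[tag].append(value)
    return {tag: float(np.mean(values)) for tag, values in groups.items()}

# Hypothetical per-trajectory ADEs in meters, with hypothetical scenario labels.
per_track = [0.1, 0.2, 0.1, 0.2, 1.8]
tags = ["straight", "straight", "straight", "straight", "intersection"]

overall = float(np.mean(per_track))
breakdown = ade_by_scenario(per_track, tags)
print(round(overall, 2))  # 0.48
print({tag: round(value, 2) for tag, value in breakdown.items()})
# {'straight': 0.15, 'intersection': 1.8}
```

The aggregate of 0.48 m looks moderate, but the breakdown shows that the single intersection case carries an error of 1.8 m, more than ten times the straight-road average.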

7. Domain-Specific Usage Patterns and Extensions

Autonomous Driving: ADE is ubiquitous in reports of predictive fidelity for planners, trajectory predictors, and imitation learners (Tran et al., 2023, Mohan et al., 5 May 2026). Robustness assessment under adversarial attack further utilizes ADE as a stability indicator.
Pedestrian Motion and Social Forecasting: ADE is integral to the ETH/UCY benchmarks and to evaluations of Social-LSTM and attention-based architectures; multiclass, multimodal, and attention-weighted models consistently report ADE as a principal metric (Ahmadi et al., 2023, Mohamed et al., 2022).
Safety and Risk-Aware Navigation: While ADE conveys geometric error, its limitations for risk and hazard awareness have driven the adoption of complementary statistics: risk-aware scores (AWRS), joint ADE (minJointADE), and scenario-filtered metrics (Sapkota et al., 4 May 2025, Ahmad et al., 5 Aug 2025, Liu et al., 11 Oct 2025).
Digital Twin and Multi-Sensor Systems: High-fidelity simulation platforms and infrastructure-based sensing enable improvements in ADE through sensor fusion and map constraints (Ahmad et al., 5 Aug 2025).

References

(Ahmadi et al., 2023) Human trajectory prediction using LSTM with Attention mechanism
(Mohamed et al., 2022) Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation
(Liu et al., 11 Oct 2025) Beyond ADE and FDE: A Comprehensive Evaluation Framework for Safety-Critical Prediction in Multi-Agent Autonomous Driving Scenarios
(Tran et al., 2023) What Truly Matters in Trajectory Prediction for Autonomous Driving?
(Mohan et al., 5 May 2026) Real-Time Evaluation of Autonomous Systems under Adversarial Attacks
(Sapkota et al., 4 May 2025) SafeNav: Safe Path Navigation using Landmark Based Localization in a GPS-denied Environment
(Wei et al., 2024) KI-GAN: Knowledge-Informed Generative Adversarial Networks for Enhanced Multi-Vehicle Trajectory Forecasting at Signalized Intersections
(Ahmad et al., 5 Aug 2025) Historical Prediction Attention Mechanism based Trajectory Forecasting for Proactive Work Zone Safety in a Digital Twin Environment