Dynamic Occupied Space Loss Function

Updated 17 November 2025

The paper introduces a DOS loss that integrates average displacement error with a novel collision penalty based on dynamic occupancy.
The DOS loss adapts a dynamic disk radius from empirical pairwise distances, ensuring realistic collision limits in varied crowd densities.
Empirical evaluations demonstrate that DOS loss reduces collision rates while maintaining displacement accuracy, outperforming static approaches.

The Dynamic Occupied Space (DOS) loss function is a density-sensitive objective designed to improve physical realism and predictive accuracy in pedestrian trajectory forecasting models. DOS loss augments standard displacement-based metrics by explicitly penalizing predicted inter-personal collisions in a manner that adapts to the density and spatial distribution observed in each scene, leading to trajectory predictions that better reflect true spatial constraints across varied crowd contexts.

1. Mathematical Formulation and Structure

The DOS loss function integrates two distinct components: the standard average displacement error (ADE) and a novel collision penalty (CP) sensitive to dynamic occupancy:

$\mathcal{L}_{DOS} = \mathrm{ADE} + \lambda\,\mathcal{CP}$

where $\mathrm{ADE}$ is computed as

$\frac{1}{M(T_{pred}-T_{obs})} \sum_{i=1}^M \sum_{t=T_{obs}+1}^{T_{pred}} \sqrt{(x_i^t-\hat{x}_i^t)^2 + (y_i^t-\hat{y}_i^t)^2}$

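with $(x_i^t, y_i^t)$ and $(\hat{x}_i^t, \hat{y}_i^t)$ representing the true and predicted coordinates of pedestrian $i$ at time $t$. $M$ is the batch size (the number of pedestrian trajectories summed over); $T_{obs}$ and $T_{pred}$ are the indices of the last observed and last predicted frames, so the average runs over the $T_{pred}-T_{obs}$ predicted frames only.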
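The ADE term can be sketched in a few lines of plain Python. This is only an illustration of the formula: the function name and the nested-list layout `[pedestrian][frame] -> (x, y)` are assumptions made here, and a training implementation would use differentiable tensor operations instead.

```python
import math


def average_displacement_error(true_xy, pred_xy):
    """Mean Euclidean error over all pedestrians and predicted frames.

    true_xy, pred_xy: nested lists indexed [pedestrian][frame] -> (x, y),
    holding only the predicted frames T_obs+1 .. T_pred (assumed layout).
    """
    num_peds = len(true_xy)
    horizon = len(true_xy[0])  # equals T_pred - T_obs
    total = 0.0
    for true_traj, pred_traj in zip(true_xy, pred_xy):
        for (x, y), (x_hat, y_hat) in zip(true_traj, pred_traj):
            total += math.hypot(x - x_hat, y - y_hat)
    return total / (num_peds * horizon)


# Toy data invented for illustration: 2 pedestrians, 2 predicted frames.
true_xy = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]]
pred_xy = [[(0.0, 0.0), (1.0, 0.5)], [(0.3, 1.4), (1.0, 1.0)]]
ade = average_displacement_error(true_xy, pred_xy)
```

In this toy batch the four per-frame errors are 0, 0.5, 0.5 and 0 m, so the ADE is 0.25 m.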

$\mathcal{CP}$ evaluates collision severity:

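$\mathcal{CP} = \sum_{i=1}^M \sum_{t=T_{obs}+1}^{T_{pred}} \sum_{j \neq i} \mathbb{1}\left[d_{ij}^t < 2\bar R\right] \left( 1 - \frac{d_{ij}^t}{2\bar R} \right)$

where $d_{ij}^t$ is the Euclidean distance between the predicted centers of pedestrians $i$ and $j$ at time $t$, and $\mathbb{1}[\cdot]$ is the indicator function. The penalty is therefore active only for pairs closer than $2\bar{R}$, and it grows linearly from 0 at $d_{ij}^t = 2\bar{R}$ to 1 at full overlap ( $d_{ij}^t = 0$ ).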

$\lambda$ specifies the relative weight between displacement and collision penalties and is tuned empirically.

This combined loss enforces both trajectory realism and spatial separation, penalizing close approaches only where contextually meaningful.
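The collision penalty and the combined objective can be sketched as follows. The function names, the data layout and the toy numbers are assumptions made for illustration. The ADE value is passed in as a plain number, and the code is written in plain Python rather than as a differentiable loss.

```python
import math


def collision_penalty(pred_xy, r_bar):
    """Sum of 1 - d_ij / (2 * r_bar) over ordered pairs closer than 2 * r_bar.

    pred_xy: [pedestrian][frame] -> (x, y) for predicted frames (assumed layout).
    """
    threshold = 2.0 * r_bar
    num_peds = len(pred_xy)
    horizon = len(pred_xy[0])
    penalty = 0.0
    for t in range(horizon):
        for i in range(num_peds):
            for j in range(num_peds):
                if i == j:
                    continue
                xi, yi = pred_xy[i][t]
                xj, yj = pred_xy[j][t]
                d = math.hypot(xi - xj, yi - yj)
                if d < threshold:  # penalty is active only inside 2 * r_bar
                    penalty += 1.0 - d / threshold
    return penalty


def dos_loss(ade, pred_xy, r_bar, lam):
    """L_DOS = ADE + lambda * CP."""
    return ade + lam * collision_penalty(pred_xy, r_bar)


# Toy data invented for illustration: the pair is 0.2 m apart in the first
# frame and 1.0 m apart in the second.
pred_xy = [[(0.0, 0.0), (0.0, 0.0)], [(0.2, 0.0), (1.0, 0.0)]]
cp = collision_penalty(pred_xy, r_bar=0.2)
loss = dos_loss(ade=0.25, pred_xy=pred_xy, r_bar=0.2, lam=0.01)
```

Because the sum runs over $j \neq i$, each close pair is counted twice (once as $(i,j)$ and once as $(j,i)$). Here the single overlapping pair contributes $2 \times 0.5 = 1.0$ to the penalty.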

2. Occupancy Modeling and Dynamic Disk Radius

The DOS framework models each pedestrian as a circular disk approximating effective “personal space.” Unlike prior methods that fix the disk radius ( $R=0.2$ m), DOS employs a dynamic radius $\bar{R}$ estimated from the empirical distribution of pairwise distances in the ground-truth trajectory batch. In the expressions below, $x_i^t$ denotes the 2D position vector of pedestrian $i$ at frame $t$, and $R_{\rm fixed}$ is the fixed disk radius used as the initial overlap criterion.

For each predicted frame $t$, pairwise overlaps are identified:

$\mathcal{O}^t = \{(i,j)\mid i<j,\; \Vert x_i^t-x_j^t\Vert < 2R_{\rm fixed}\}$

Mean half-distance of overlaps at frame $t$ :

$R^t = \frac{1}{2|\mathcal{O}^t|} \sum_{(i,j)\in\mathcal{O}^t} \Vert x_i^t - x_j^t \Vert$

Aggregated dynamic radius for the batch:

$\bar{R} = \frac{1}{T_{pred}-T_{obs}} \sum_{t=T_{obs}+1}^{T_{pred}} R^t$

This adaptive radius is used for all predicted disks, ensuring the occupancy threshold aligns with observed density and spatial proximity, and allowing dynamic calibration of collision sensitivity.
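The three steps above can be sketched in plain Python. Two points are assumptions made here rather than statements from the source: `r_fixed` defaults to 0.2 m, and a frame with no overlapping pair (empty $\mathcal{O}^t$, for which $R^t$ is undefined) falls back to `r_fixed`.

```python
import math


def dynamic_radius(xy, r_fixed=0.2):
    """Batch-level dynamic radius R_bar.

    xy: [pedestrian][frame] -> (x, y) positions (assumed layout).
    """
    num_peds = len(xy)
    horizon = len(xy[0])
    frame_radii = []
    for t in range(horizon):
        # Step 1: distances of unordered pairs closer than 2 * r_fixed.
        overlaps = []
        for i in range(num_peds):
            for j in range(i + 1, num_peds):
                d = math.hypot(xy[i][t][0] - xy[j][t][0],
                               xy[i][t][1] - xy[j][t][1])
                if d < 2.0 * r_fixed:
                    overlaps.append(d)
        # Step 2: mean half-distance of the overlapping pairs at frame t.
        if overlaps:
            frame_radii.append(sum(overlaps) / (2.0 * len(overlaps)))
        else:
            # ASSUMPTION: the text does not define R^t for an empty O^t.
            frame_radii.append(r_fixed)
    # Step 3: average over the frames of the horizon.
    return sum(frame_radii) / horizon


# Toy data invented for illustration: 3 pedestrians, 2 frames.
xy = [
    [(0.0, 0.0), (0.0, 0.0)],
    [(0.3, 0.0), (0.1, 0.0)],
    [(5.0, 5.0), (5.0, 5.0)],
]
r_bar = dynamic_radius(xy)
```

In the toy batch one pair is 0.3 m apart in the first frame and 0.1 m apart in the second, giving $R^t$ values of 0.15 m and 0.05 m and a batch radius of 0.10 m.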

3. Density-Adaptive Collision Penalty

The collision penalty's density adaptation relies on the dynamic radius $\bar{R}$ , directly induced by the empirical spatial distribution. In each batch:

- $\bar{R}$ is computed from observed data;
- the collision threshold ( $\tau=2\bar{R}$ ) adjusts accordingly: it shrinks in dense scenes to prevent excessive penalization from unavoidable proximity, and expands in sparse scenes to preserve realistic boundaries.

This mechanism reduces erroneous collision penalties in high-density contexts and enforces interpersonal spacing in low-density scenarios, improving both realism and predictive utility.
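A short worked example, using hypothetical numbers that are not taken from the paper, shows how the same predicted distance is treated under two different batch radii. Take a predicted pair at $d_{ij}^t = 0.30$ m.

With $\bar{R} = 0.18$ m (sparser batch), the threshold is $\tau = 2\bar{R} = 0.36$ m. Since $d_{ij}^t < \tau$, the pair is penalized:

$1 - \frac{d_{ij}^t}{2\bar{R}} = 1 - \frac{0.30}{0.36} \approx 0.167$

With $\bar{R} = 0.12$ m (denser batch), the threshold is $\tau = 0.24$ m. Since $d_{ij}^t \geq \tau$, the pair contributes no penalty.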

4. Hyperparameter Selection and Tuning

The single additional hyperparameter, $\lambda$, balances ADE and the collision penalty. The recommended process is a grid search on held-out validation data that monitors both the collision rate (CR) and the displacement errors (ADE and final displacement error, FDE):

Typical effective $\lambda$ values: $[10^{-3}, 10^{-2}]$
Example tuning outcomes:
- $\lambda=0.01$ for low/medium density
- $\lambda\approx0.002$ for very-high density
- $\lambda\approx0.003$ for mixed density sets

$\lambda$ is selected so that minimizing collisions does not degrade trajectory accuracy.
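One possible way to turn this selection criterion into code is sketched below. The rule used is an assumption made for illustration: among candidates whose validation ADE does not exceed a baseline ADE, pick the lowest collision rate. The function name and the validation numbers are likewise hypothetical and do not come from the paper.

```python
def select_lambda(results, baseline_ade, tolerance=0.0):
    """Pick the lambda with the lowest validation collision rate among
    candidates whose validation ADE stays within `tolerance` of the baseline.

    results: dict mapping lambda -> (ade, collision_rate) on validation data.
    Returns None if no candidate satisfies the ADE constraint.
    """
    admissible = {
        lam: (ade, cr)
        for lam, (ade, cr) in results.items()
        if ade <= baseline_ade + tolerance
    }
    if not admissible:
        return None
    # Lowest collision rate first; ties are broken by lower ADE.
    return min(admissible, key=lambda lam: (admissible[lam][1], admissible[lam][0]))


# Hypothetical validation results, invented for illustration only.
results = {
    0.001: (0.29, 0.35),
    0.003: (0.28, 0.30),
    0.01: (0.33, 0.27),
}
best = select_lambda(results, baseline_ade=0.30)
```

In this made-up grid the largest weight gives the lowest collision rate but worsens ADE beyond the baseline, so it is rejected and `best` is 0.003.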

5. Training Integration and Implementation

The DOS loss is implemented atop the Social LSTM architecture on the TrajNet++ benchmark, using PyTorch:

- Each training batch contains 8 trajectory sequences, with $T_{obs}=9$ observed frames followed by 12 predicted frames, so that $T_{pred}-T_{obs}=12$ in the formulas of Section 1.
- Observed trajectories are encoded by the LSTM, and the predictions are then decoded.
- After prediction, $\bar{R}$ is calculated and the DOS loss is computed for the batch.
- Backpropagation is performed jointly through the ADE and collision branches, using the Adam optimizer (learning rate 0.001).
- Early stopping (patience 5) governs training, which may run for up to 15 epochs. Only standard dropout regularization is used.

DOS loss can be integrated with any predictor outputting $(x,y)$ sequences, rendering it model-agnostic with respect to network architecture.
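The stopping rule from the training list above (patience 5, at most 15 epochs) can be sketched independently of any model. The monitored quantity is assumed here to be a per-epoch validation loss, which the source does not specify. The function name and the loss values are hypothetical.

```python
def run_early_stopping(val_losses, patience=5, max_epochs=15):
    """Return (epochs_run, best_loss) for a sequence of validation losses.

    Training stops once `patience` consecutive epochs fail to improve on
    the best loss seen so far, or after `max_epochs` epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for val_loss in val_losses[:max_epochs]:
        epochs_run += 1
        if val_loss < best:
            best = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return epochs_run, best


# Hypothetical loss curve: the best value is reached at epoch 3.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.6]
epochs_run, best_loss = run_early_stopping(losses)
```

With this curve, training stops after epoch 8, five epochs after the best value of 0.7, and never reaches the later improvement at epoch 9.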

6. Empirical Evaluation and Ablation Analysis

Quantitative evaluation on Festival of Lights Lyon 2022 data spans homogeneous density (lowD, mediumD, highD, veryHD) and heterogeneous density (allD). DOS-Social LSTM is benchmarked against ADE-Social LSTM (ADE only) and TTC-Social LSTM (fixed-radius penalty):

Model	lowD ADE/FDE/CR	mediumD ADE/FDE/CR	highD ADE/FDE/CR	veryHD ADE/FDE/CR
ADE-SLSTM	0.499/0.949/40.6%	0.345/0.671/29.2%	0.241/0.418/33.8%	0.259/0.456/51.3%
TTC-SLSTM	0.469/0.904/37.5%	0.307/0.549/20.6%	0.251/0.435/19.3%	0.319/0.577/38.7%
DOS-SLSTM	0.463/0.876/22.9%	0.323/0.621/12.3%	0.239/0.413/25.5%	0.238/0.420/47.4%

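Relative to the ADE-only baseline, DOS-SLSTM reduces the collision rate (CR) in every density class, by up to 17.7 percentage points in low density, while also lowering ADE and FDE. Against TTC-SLSTM the picture is mixed: TTC-SLSTM reaches lower CR in highD and veryHD and lower ADE/FDE in mediumD, but its ADE/FDE in highD and veryHD are worse than those of the ADE-only baseline, so its collision gains in dense scenes come at the cost of displacement accuracy. On heterogeneous density (allD), DOS-SLSTM with $\lambda=0.003$ achieves ADE=0.248 m, FDE=0.445 m, CR=29.9%, compared to ADE-SLSTM’s 0.257 m/0.473 m/39.3%.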
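The percentage-point differences in collision rate can be recomputed directly from the table. The snippet below only copies the CR column values from the table and subtracts them. The helper name is chosen here for illustration.

```python
# Collision rates (%) copied from the table above.
collision_rate = {
    "ADE-SLSTM": {"lowD": 40.6, "mediumD": 29.2, "highD": 33.8, "veryHD": 51.3},
    "TTC-SLSTM": {"lowD": 37.5, "mediumD": 20.6, "highD": 19.3, "veryHD": 38.7},
    "DOS-SLSTM": {"lowD": 22.9, "mediumD": 12.3, "highD": 25.5, "veryHD": 47.4},
}


def cr_reduction(baseline, model):
    """Percentage-point CR reduction of `model` relative to `baseline`.

    Positive values mean `model` collides less often than `baseline`.
    """
    return {
        density: round(collision_rate[baseline][density]
                       - collision_rate[model][density], 1)
        for density in collision_rate[baseline]
    }


vs_ade = cr_reduction("ADE-SLSTM", "DOS-SLSTM")
vs_ttc = cr_reduction("TTC-SLSTM", "DOS-SLSTM")
```

Relative to ADE-SLSTM the reductions are 17.7, 16.9, 8.3 and 3.9 points from lowD to veryHD. Relative to TTC-SLSTM they are 14.6 and 8.3 points in lowD and mediumD, and negative (-6.2 and -8.7) in highD and veryHD, where TTC-SLSTM has the lower collision rate.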

Ablation contrasts the dynamic DOS loss with a static-radius SOS-SLSTM, showing that the static method can lower CR in low densities but raises ADE/FDE in high or heterogeneous contexts; dynamic adaptation is necessary for simultaneous reduction of both collision and displacement errors across all conditions.

7. Practical Guidelines and Broader Implications

- $\lambda$ must be tuned on data where collision and displacement are jointly evaluated.
- A single batch-level dynamic radius $\bar{R}$ is sufficient; per-pedestrian inference offers negligible additional benefit for typical crowd modeling settings.
- DOS loss is modular and can be transplanted into models like Transformers, CVAEs, or any position-sequence predictor.
- In extremely dense or mixed-density scenarios (e.g. stadium exits, concerts), scene-adaptive collision penalization is critical for realism; static priors lead to false collisions and degrade accuracy.
- This approach enables joint optimization for both trajectory precision and physical feasibility, a necessary criterion for predictive agents in real-world multi-agent environments.

The DOS loss function constitutes a principled, empirically validated technique for ensuring deep pedestrian trajectory predictors are simultaneously accurate and cognizant of realistic spatial constraints, outperforming static penalty approaches in both homogeneous and heterogeneous crowd densities.
