Dynamic Occupied Space Loss Function
- The paper introduces a DOS loss that integrates average displacement error with a novel collision penalty based on dynamic occupancy.
- The DOS loss adapts a dynamic disk radius from empirical pairwise distances, ensuring realistic collision limits in varied crowd densities.
- Empirical evaluations demonstrate that DOS loss reduces collision rates while maintaining displacement accuracy, outperforming static approaches.
The Dynamic Occupied Space (DOS) loss function is a density-sensitive objective designed to improve physical realism and predictive accuracy in pedestrian trajectory forecasting models. DOS loss augments standard displacement-based metrics by explicitly penalizing predicted inter-personal collisions in a manner that adapts to the density and spatial distribution observed in each scene, leading to trajectory predictions that better reflect true spatial constraints across varied crowd contexts.
1. Mathematical Formulation and Structure
The DOS loss function integrates two distinct components, combined through a scalar weight: the standard average displacement error (ADE) and a novel collision penalty (CP) that is sensitive to dynamic occupancy.
- The ADE term averages the Euclidean distance between the true and predicted coordinates of each pedestrian at each time step, taken over the batch and over the predicted frames that follow the observed frames.
- The CP term evaluates collision severity from the Euclidean distance between the predicted centers of each pair of pedestrians at each time step; the penalty is active only for pairs closer than the collision threshold set by the dynamic disk radius.
- The weight specifies the relative importance of the displacement and collision penalties and is tuned empirically.
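For reference, the displacement term and the overall combination can be written out as follows. The notation is assumed here for illustration and is not taken from the source: $N$ is the batch size, $T_{obs}$ and $T_{pred}$ are the last observed and last predicted frame indices, $\mathbf{x}_i^t$ and $\hat{\mathbf{x}}_i^t$ are the true and predicted positions of pedestrian $i$ at time $t$, and $\lambda$ is the weight. The ADE expression is the standard definition; the additive combination is one common way to weight two loss terms and may differ from the paper's exact form, so the collision term is left abstract.

```latex
\mathcal{L}_{\mathrm{ADE}}
  = \frac{1}{N\,(T_{pred} - T_{obs})}
    \sum_{i=1}^{N} \sum_{t=T_{obs}+1}^{T_{pred}}
    \left\lVert \mathbf{x}_i^{t} - \hat{\mathbf{x}}_i^{t} \right\rVert_2 ,
\qquad
\mathcal{L}_{\mathrm{DOS}}
  = \mathcal{L}_{\mathrm{ADE}} + \lambda\, \mathcal{L}_{\mathrm{CP}}
```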
This combined loss enforces both trajectory realism and spatial separation, penalizing close approaches only where contextually meaningful.
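A minimal sketch of how such a combined objective might be evaluated, using only the Python standard library. The hinge-style penalty (overlap depth `2 * radius - d` for pairs closer than `2 * radius`), the additive weighting, and all function and parameter names are assumptions made for illustration; the paper's exact penalty may differ.

```python
import math
from itertools import combinations


def ade(true_xy, pred_xy):
    """Mean Euclidean error over all pedestrians and predicted frames.

    Both arguments are lists over pedestrians of lists over frames of (x, y).
    """
    errors = [
        math.dist(p, q)
        for true_track, pred_track in zip(true_xy, pred_xy)
        for p, q in zip(true_track, pred_track)
    ]
    return sum(errors) / len(errors)


def collision_penalty(pred_xy, radius):
    """ASSUMED hinge form: sum of overlap depths, averaged over frames.

    A pair contributes only when its centre distance is below 2 * radius.
    """
    n_frames = len(pred_xy[0])
    total = 0.0
    for t in range(n_frames):
        for i, j in combinations(range(len(pred_xy)), 2):
            d = math.dist(pred_xy[i][t], pred_xy[j][t])
            total += max(0.0, 2.0 * radius - d)
    return total / n_frames


def dos_loss(true_xy, pred_xy, radius, weight):
    """ASSUMED additive combination of ADE and the collision penalty."""
    return ade(true_xy, pred_xy) + weight * collision_penalty(pred_xy, radius)
```

With this form, pairs farther apart than `2 * radius` contribute nothing, so the collision term only shapes the gradient where predicted disks actually overlap.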
2. Occupancy Modeling and Dynamic Disk Radius
The DOS framework models each pedestrian as a circular disk approximating effective “personal space.” Unlike prior methods that fix the disk radius at a constant value, DOS employs a dynamic radius estimated from the empirical distribution of pairwise distances in the ground-truth trajectory batch:
- For each predicted frame, the overlapping pedestrian pairs are identified.
- The mean half-distance of the overlapping pairs is computed for that frame.
- The per-frame values are aggregated into a single dynamic radius for the batch.
This adaptive radius is used for all predicted disks, ensuring the occupancy threshold aligns with observed density and spatial proximity, and allowing dynamic calibration of collision sensitivity.
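The three steps can be sketched as below. The source does not preserve the exact overlap criterion or aggregation rule, so this sketch assumes a hypothetical cutoff `ref_distance` below which a ground-truth pair counts as overlapping, a plain mean over frames, and a hypothetical `fallback` radius for batches with no overlapping pair.

```python
import math
from itertools import combinations


def dynamic_radius(gt_xy, ref_distance, fallback):
    """Estimate one disk radius for a batch from ground-truth positions.

    gt_xy: list over pedestrians of lists over frames of (x, y).
    ref_distance: HYPOTHETICAL cutoff; closer pairs count as overlapping.
    fallback: HYPOTHETICAL radius returned when no frame has such a pair.
    """
    n_frames = len(gt_xy[0])
    frame_means = []
    for t in range(n_frames):
        half_distances = []
        for i, j in combinations(range(len(gt_xy)), 2):
            d = math.dist(gt_xy[i][t], gt_xy[j][t])
            if d < ref_distance:
                half_distances.append(d / 2.0)
        if half_distances:
            # Mean half-distance of the overlapping pairs in this frame.
            frame_means.append(sum(half_distances) / len(half_distances))
    if not frame_means:
        return fallback
    # Aggregate the per-frame means into a single batch-level radius.
    return sum(frame_means) / len(frame_means)
```

Because the radius is derived from half-distances of close pairs, a denser batch (smaller typical spacing) yields a smaller radius under these assumptions.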
3. Density-Adaptive Collision Penalty
The collision penalty's density adaptation relies on the dynamic radius, which is directly induced by the empirical spatial distribution. In each batch:
- The dynamic radius is computed from observed data;
- The collision threshold adjusts accordingly, shrinking in dense scenes to prevent excessive penalization of unavoidable proximity and expanding in sparse scenes to preserve realistic boundaries.
This mechanism reduces erroneous collision penalties in high-density contexts and enforces interpersonal spacing in low-density scenarios, improving both realism and predictive utility.
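A small illustration of the effect, assuming that two disks collide when their centre distance falls below twice the radius. The radii and positions are made-up values chosen only to show the behaviour, not figures from the paper.

```python
import math
from itertools import combinations


def count_collisions(positions, radius):
    """Count pairs whose centre distance is below 2 * radius (ASSUMED threshold)."""
    return sum(
        1 for p, q in combinations(positions, 2)
        if math.dist(p, q) < 2.0 * radius
    )


# Three pedestrians standing 0.5 m apart in a line (illustrative values).
frame = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]

dense_radius = 0.2   # hypothetical radius estimated from a dense batch
sparse_radius = 0.3  # hypothetical radius estimated from a sparse batch

print(count_collisions(frame, dense_radius))   # 0: spacing of 0.5 m exceeds 0.4 m
print(count_collisions(frame, sparse_radius))  # 2: spacing of 0.5 m is below 0.6 m
```

The same predicted configuration is left unpenalized under the smaller radius and flagged under the larger one, which is the adaptation described above.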
4. Hyperparameter Selection and Tuning
The single additional hyperparameter, the loss weight, balances ADE and the collision penalty. The recommended process is a grid search on held-out validation data, monitoring both collision rate (CR) and displacement errors (ADE and final displacement error, FDE). Tuned values are reported separately by density regime:
- low/medium density
- very-high density
- mixed-density sets
Selection ensures collision minimization does not degrade trajectory accuracy.
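One way to encode that selection rule is sketched below: among candidate weights, keep those whose validation ADE does not exceed a baseline (plus an optional tolerance) and pick the one with the lowest collision rate. The function name, the `evaluate` callback, and the tolerance parameter are hypothetical, and the numbers in the usage example are placeholders, not results from the paper.

```python
def select_weight(candidates, evaluate, baseline_ade, ade_tolerance=0.0):
    """Pick the weight with the lowest CR whose ADE stays within the budget.

    evaluate: HYPOTHETICAL callback mapping a weight to (ade, collision_rate)
              measured on held-out validation data.
    Returns None when no candidate keeps ADE within baseline_ade + ade_tolerance.
    """
    best_weight, best_cr = None, float("inf")
    for weight in candidates:
        ade, cr = evaluate(weight)
        if ade > baseline_ade + ade_tolerance:
            continue  # reject weights that degrade displacement accuracy
        if cr < best_cr:
            best_weight, best_cr = weight, cr
    return best_weight


# Placeholder validation outcomes (made up for illustration only).
toy_results = {0.0: (0.50, 40.0), 0.1: (0.48, 30.0), 0.5: (0.49, 25.0), 1.0: (0.60, 20.0)}
chosen = select_weight(sorted(toy_results), toy_results.get, baseline_ade=0.50)
print(chosen)  # 0.5
```

In the placeholder data, the largest weight has the lowest collision rate but is rejected because its ADE exceeds the baseline.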
5. Training Integration and Implementation
The DOS loss is implemented atop the Social LSTM architecture on the TrajNet++ benchmark, using PyTorch:
- Each training batch contains 8 trajectory sequences, each split into observed frames and predicted frames.
- Observed trajectories are encoded by the LSTM, and predictions are then decoded.
- After prediction, the dynamic radius is calculated and the DOS loss is computed for the batch.
- Backpropagation is performed jointly through ADE and collision branches via the Adam optimizer (LR=0.001).
- Early stopping (patience 5) governs training, which may run up to 15 epochs. Only standard dropout regularization is used.
DOS loss can be integrated with any predictor outputting sequences, rendering it model-agnostic with respect to network architecture.
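The stopping schedule described above (patience 5, at most 15 epochs) can be sketched independently of the network. This sketch assumes that the monitored quantity is a validation loss and that any strict improvement resets the patience counter; the list of per-epoch losses stands in for a real training loop, and the function name is hypothetical.

```python
def run_with_early_stopping(val_loss_per_epoch, patience=5, max_epochs=15):
    """Return (epochs_run, best_epoch), both 1-indexed.

    val_loss_per_epoch: stand-in for the validation loss measured after
    each training epoch (ASSUMED monitored quantity).
    """
    best_loss = float("inf")
    best_epoch = 0
    stale = 0
    epochs_run = 0
    for epoch, loss in enumerate(val_loss_per_epoch[:max_epochs], start=1):
        epochs_run = epoch
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no improvement for `patience` consecutive epochs
    return epochs_run, best_epoch
```

In a real run, each iteration would train one epoch with the optimizer, evaluate on validation data, and restore the checkpoint from `best_epoch` after stopping.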
6. Empirical Evaluation and Ablation Analysis
Quantitative evaluation on Festival of Lights Lyon 2022 data spans homogeneous density (lowD, mediumD, highD, veryHD) and heterogeneous density (allD). DOS-Social LSTM is benchmarked against ADE-Social LSTM (ADE only) and TTC-Social LSTM (fixed-radius penalty):
| Model | lowD ADE/FDE/CR | mediumD ADE/FDE/CR | highD ADE/FDE/CR | veryHD ADE/FDE/CR |
|---|---|---|---|---|
| ADE-SLSTM | 0.499/0.949/40.6% | 0.345/0.671/29.2% | 0.241/0.418/33.8% | 0.259/0.456/51.3% |
| TTC-SLSTM | 0.469/0.904/37.5% | 0.307/0.549/20.6% | 0.251/0.435/19.3% | 0.319/0.577/38.7% |
| DOS-SLSTM | 0.463/0.876/22.9% | 0.323/0.621/12.3% | 0.239/0.413/25.5% | 0.238/0.420/47.4% |
DOS-SLSTM lowers the collision rate (CR) relative to the ADE-only baseline in every density class, by up to 17.7 percentage points in low density, while also improving ADE and FDE. TTC-SLSTM reaches a lower CR than DOS-SLSTM in the highD and veryHD classes, but its ADE and FDE there are worse than those of the ADE-only baseline. On heterogeneous density (allD), DOS-SLSTM with its tuned weight achieves ADE = 0.248 m, FDE = 0.445 m, CR = 29.9%, compared with ADE-SLSTM's 0.257 m / 0.473 m / 39.3%.
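The percentage-point differences can be checked directly against the table. The snippet below copies the CR column for each density class and computes the reduction of one model relative to another; the helper name is hypothetical.

```python
# Collision rates (%) copied from the table above.
cr = {
    "ADE-SLSTM": {"lowD": 40.6, "mediumD": 29.2, "highD": 33.8, "veryHD": 51.3},
    "TTC-SLSTM": {"lowD": 37.5, "mediumD": 20.6, "highD": 19.3, "veryHD": 38.7},
    "DOS-SLSTM": {"lowD": 22.9, "mediumD": 12.3, "highD": 25.5, "veryHD": 47.4},
}


def cr_reduction(baseline, model):
    """Percentage-point drop in CR of `model` relative to `baseline`, per class."""
    return {
        density: round(cr[baseline][density] - cr[model][density], 1)
        for density in cr[baseline]
    }


print(cr_reduction("ADE-SLSTM", "DOS-SLSTM"))
# {'lowD': 17.7, 'mediumD': 16.9, 'highD': 8.3, 'veryHD': 3.9}
print(cr_reduction("TTC-SLSTM", "DOS-SLSTM"))
# {'lowD': 14.6, 'mediumD': 8.3, 'highD': -6.2, 'veryHD': -8.7}
```

Negative values in the second line mark the classes where the fixed-radius TTC penalty yields fewer collisions than DOS.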
Ablation contrasts the dynamic DOS loss with a static-radius SOS-SLSTM, showing that the static method can lower CR in low densities but raises ADE/FDE in high or heterogeneous contexts; dynamic adaptation is necessary for simultaneous reduction of both collision and displacement errors across all conditions.
7. Practical Guidelines and Broader Implications
- The loss weight must be tuned on data where collision and displacement are jointly evaluated.
- A single batch-level dynamic radius is sufficient; per-pedestrian inference offers negligible additional benefit for typical crowd modeling settings.
- DOS loss is modular and can be transplanted into models like Transformers, CVAEs, or any position-sequence predictor.
- In extremely dense or mixed-density scenarios (e.g. stadium exits, concerts), scene-adaptive collision penalization is critical for realism; static priors lead to false collisions and degrade accuracy.
- This approach enables joint optimization for both trajectory precision and physical feasibility, a necessary criterion for predictive agents in real-world multi-agent environments.
The DOS loss function constitutes a principled, empirically validated technique for ensuring deep pedestrian trajectory predictors are simultaneously accurate and cognizant of realistic spatial constraints, outperforming static penalty approaches in both homogeneous and heterogeneous crowd densities.