Platoon-based Delay Minimization

Updated 23 November 2025

Platoon-based Delay Minimization is a framework that reduces network, computation, and control delays in vehicular platoons to enhance safety and traffic efficiency.
It employs methods like PSO-based MAC optimization, deep reinforcement learning, and SMDP offloading to achieve measurable improvements in delay metrics, throughput, and fuel consumption.
Hybrid analytical and data-driven models integrate delay-aware control synthesis with dynamic scheduling to ensure stable platooning and effective V2X communications.

Platoon-based Delay Minimization (PDM) addresses the systematic reduction and balancing of network, computation, and control delays within vehicular platoon systems, targeting both safety-critical control loops and broader traffic efficiency objectives. PDM methods span MAC-layer contention optimization for inter-vehicle communications, delay-aware control synthesis for vehicle following and string stability, dynamic scheduling at conflict zones, and end-to-end task offloading in interconnected fog-enhanced architectures. The literature documents both analytical and data-driven PDM frameworks that integrate rigorous network-delay models, optimal scheduling, multi-agent reinforcement learning, and control–communication co-design principles.

1. Objectives and Formal Problem Definition

PDM objectives are formulated differently depending on the system architecture and the origin of the delay, but they all seek to minimize the mean, maximum, or variance of a delay metric, typically per-packet delivery time, per-vehicle waiting time, or end-to-end control-loop latency:

- Inter-platoon MAC delay minimization: Choose per-node contention parameters to minimize the average one-hop delay across a backbone platoon, subject to protocol and stability constraints (Wu et al., 2018).
- Intersection scheduling PDM: Minimize $\bar D = \frac{1}{N} \sum_{i=1}^N W_i$, where $W_i$ is the service delay or waiting time of vehicle $i$ crossing a conflict zone, under FCFS or reservation policies plus physical conflict constraints (Li et al., 2022, Bashiri et al., 2018).
- String stability and control delay minimization: Given plant and communication models (e.g., the delayed feedback law $u_i(t) = -K_x e_i(t-\tau)$), minimize $\tau$ subject to explicit stability bounds, or maximize the reliability $R = P[T \leq \Delta_{\max}]$, where $T$ is the network delay and $\Delta_{\max}$ is the delay bound derived from control theory (Wang et al., 2020, Zeng et al., 2018).
- Task-offloading delay minimization: Select offloading and resource allocation policies $\pi^*$ that minimize the mean processing delay, or maximize a discounted reward reflecting delay, in fog/platoon systems with task arrival, service, and V2X contention dynamics (Wu et al., 2023).
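
To make the control-delay objective concrete, the following minimal sketch simulates the delayed feedback law from the list above on a scalar integrator error model, $\dot e(t) = -K_x e(t-\tau)$. The integrator plant, the forward-Euler discretization, the constant initial history, and all function names are assumptions made for illustration; they are not taken from the cited works. For this scalar case the classical stability condition is $K_x \tau < \pi/2$, and the simulation reproduces it qualitatively.

```python
import math

def simulate_delayed_feedback(gain, delay, dt=0.01, horizon=60.0, e0=1.0):
    """Forward-Euler simulation of de/dt = -gain * e(t - delay).

    Assumes a constant history e(t) = e0 for t <= 0 (illustrative choice).
    Returns the error samples from t = 0 to t = horizon.
    """
    lag = round(delay / dt)            # delay expressed in time steps
    steps = round(horizon / dt)
    e = [e0] * (lag + 1)               # samples covering [-delay, 0]
    for _ in range(steps):
        delayed = e[-1 - lag]          # e(t - delay)
        e.append(e[-1] - dt * gain * delayed)
    return e[lag:]

def tail_amplitude(trajectory, n_last=1000):
    """Largest |e| over the final n_last samples."""
    return max(abs(x) for x in trajectory[-n_last:])

K_X = 1.0
TAU_CRIT = math.pi / (2.0 * K_X)       # classical bound for the scalar case

stable = simulate_delayed_feedback(K_X, delay=1.0)     # K_x * tau = 1.0 < pi/2
unstable = simulate_delayed_feedback(K_X, delay=2.0)   # K_x * tau = 2.0 > pi/2
print(f"critical delay ~ {TAU_CRIT:.3f} s")
print("tail amplitude, tau = 1.0 s:", tail_amplitude(stable))
print("tail amplitude, tau = 2.0 s:", tail_amplitude(unstable))
```

With the delay below the bound the error decays; above it the error oscillates with growing amplitude. Real platoon models add vehicle dynamics and string-stability conditions, so the admissible delay there is generally tighter than this scalar bound.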

2. Methodological Approaches to Platoon Delay Minimization

2.1 Swarming Optimization for MAC-level Delay

The one-hop MAC delay minimization problem is addressed by a two-step swarming algorithm inspired by particle swarm optimization (PSO):

- Step 1: Minimize the average one-hop delay by setting a target $D_{\mathrm{target}}$ and iteratively adjusting each vehicle's window size $W_i$.
- Step 2: Balance individual one-hop delays around the minimum achieved in Step 1 via a variance-reduction PSO applied to $W$.
- Parameters: Bounded $W_i \in [1, CW_{\max}]$, velocity cap $|\Delta W_i| \leq \Delta W_{\max}$, and standard PSO coefficients.
- Results: For $n=6$, optimized $D_i \approx 3.2$ ms vs. baseline $3.6$–$4.6$ ms, and improvements in end-to-end delay, throughput, and transmission probability (Wu et al., 2018).
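
The two steps above can be sketched with a textbook PSO loop. This is a minimal illustration, not the algorithm of Wu et al.: the delay model `one_hop_delays` is a toy assumption (mean backoff plus geometric retries, with per-slot transmit probability $2/(W+1)$), and the slot time, airtime, PSO coefficients, and the 5% tolerance in Step 2 are all hypothetical values chosen only so the example runs.

```python
import random
import statistics

SLOT_MS = 0.013    # assumed slot time in ms (illustrative)
TX_MS = 0.5        # assumed packet airtime in ms (illustrative)
CW_MAX = 1024.0    # upper bound on the contention window

def one_hop_delays(windows):
    """Toy per-vehicle delay model (an assumption, not from the source)."""
    tau = [2.0 / (w + 1.0) for w in windows]   # per-slot transmit probability
    delays = []
    for i, w in enumerate(windows):
        p_free = 1.0
        for j, t in enumerate(tau):
            if j != i:
                p_free *= 1.0 - t              # no other vehicle transmits
        attempts = 1.0 / max(p_free, 1e-9)     # expected attempts (geometric)
        delays.append(attempts * (SLOT_MS * (w - 1.0) / 2.0 + TX_MS))
    return delays

def mean_one_hop_delay(windows):
    return statistics.fmean(one_hop_delays(windows))

def pso(objective, n_dims, init, n_particles=20, iters=150, v_max=32.0,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO with position bounds [1, CW_MAX] and a velocity cap."""
    rng = random.Random(seed)
    pos = [[rng.uniform(1.0, CW_MAX) for _ in range(n_dims)]
           for _ in range(n_particles)]
    pos[0] = list(init)                        # seed one particle with a known point
    vel = [[0.0] * n_dims for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for k in range(n_particles):
            for d in range(n_dims):
                v = (inertia * vel[k][d]
                     + c1 * rng.random() * (pbest[k][d] - pos[k][d])
                     + c2 * rng.random() * (gbest[d] - pos[k][d]))
                vel[k][d] = max(-v_max, min(v_max, v))
                pos[k][d] = max(1.0, min(CW_MAX, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val

BASELINE = [32.0] * 6

# Step 1: minimize the average one-hop delay.
w1, d_min = pso(mean_one_hop_delay, 6, init=BASELINE)

# Step 2: reduce the variance while keeping the mean within 5% of Step 1.
def balance(windows):
    d = one_hop_delays(windows)
    penalty = 100.0 * max(0.0, statistics.fmean(d) - 1.05 * d_min)
    return statistics.pvariance(d) + penalty

w2, _ = pso(balance, 6, init=w1, seed=1)
print("step 1 mean delay (ms):", d_min)
print("step 2 delay variance:", statistics.pvariance(one_hop_delays(w2)))
```

Seeding one particle with the previous solution guarantees that each step is never worse than its starting point; the numbers it prints depend entirely on the toy model and should not be compared with the published results.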

2.2 Intersection and Conflict Region Scheduling

At intersections, PDM is formulated as an optimal scheduling problem:

- PAIM (Platoon-based Autonomous Intersection Management): Build the cost function $J(s) = \sum_{j=2}^N (D_{p_{[j]}} + B_j)$, with $B_j$ representing cumulative blocking from preceding platoons, under safety (no-conflict), headway, and maximum-waiting-time constraints.
- Computation: Enumerate the feasible platoon release orders exhaustively or greedily, which is tractable when $N\lesssim 8$.
- Simulation: PAIM yields delays of $22.7$ s vs. $43.3$ s for fixed-time lights under identical throughput, with 13% lower fuel consumption (Bashiri et al., 2018).
- Dynamic platoon sizing (DRL-PDM): Use deep Q-networks, a deep reinforcement learning (DRL) method, to adaptively select the platoon release size $n \in \{1, ..., N_{\max}\}$ based on the real-time MDP state, achieving further delay/fuel reductions and automatic adaptation to traffic (Li et al., 2022).
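
The exhaustive-enumeration idea can be illustrated on a single conflict zone. The sketch below is a simplification under stated assumptions: it minimizes the vehicle-weighted mean waiting time rather than the PAIM cost $J(s)$, it models the conflict zone as one shared resource, and the platoon data, the `headway` value, and the function names are hypothetical.

```python
from itertools import permutations

# Hypothetical platoons: arrival time (s), size (vehicles), crossing time (s).
PLATOONS = {
    "A": {"arrival": 0.0, "size": 5, "cross": 6.0},
    "B": {"arrival": 0.5, "size": 1, "cross": 2.0},
    "C": {"arrival": 1.0, "size": 2, "cross": 3.0},
}

def mean_wait(order, platoons, headway=1.0):
    """Vehicle-weighted mean waiting time when platoons cross one at a time."""
    free_at = 0.0          # time at which the conflict zone becomes free
    weighted_wait = 0.0
    vehicles = 0
    for name in order:
        p = platoons[name]
        start = max(p["arrival"], free_at)
        weighted_wait += (start - p["arrival"]) * p["size"]
        vehicles += p["size"]
        free_at = start + p["cross"] + headway
    return weighted_wait / vehicles

def best_order(platoons, headway=1.0):
    """Enumerate every release order and keep the one with the lowest mean wait."""
    return min(permutations(platoons),
               key=lambda order: mean_wait(order, platoons, headway))

fcfs = tuple(sorted(PLATOONS, key=lambda name: PLATOONS[name]["arrival"]))
best = best_order(PLATOONS)
print("FCFS order:", fcfs, "mean wait:", mean_wait(fcfs, PLATOONS))
print("Best order:", best, "mean wait:", mean_wait(best, PLATOONS))
```

The search visits $N!$ orders, which is 40,320 for $N = 8$; this factorial growth is why enumeration is restricted to small $N$.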

2.3 Control-Communication Codesign and Delay-Awareness

PDM at the control-communication interface involves:

- Delay-dependent stability bounds: Analytical derivation of the maximum permissible delay $\tau_{\max}$ for plant and string stability, e.g., $\eta \leq 1/(2\tau)$ and $\lambda \leq K_v K_{v_0}$, based on closed-loop characteristic equations (Wang et al., 2020, Zeng et al., 2018).
- Resource allocation: Given the stability-constrained $\tau_{\max}$, select radio resources (e.g., bandwidth $B$, antennas $N$) and computation resources, and tune control gains $(K_x, K_v, ...)$.
- Reliability maximization: Compute the reliability $R = P[T \leq \Delta_{\max}]$ using queueing models and SINR distributions; maximize $R$ via joint control–network allocation (Zeng et al., 2018).
- Edge-V2I architectures: Optimize handover rate, ensure seamless connectivity via dual-connection, and limit $v_0, M$ to preserve delay contracts (Wang et al., 2020).
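
As a worked instance of the reliability metric, the sketch below evaluates $R = P[T \leq \Delta_{\max}]$ under the assumption that the link behaves as an M/M/1 FCFS queue, for which the sojourn time is exponential with rate $\mu - \lambda$ and hence $R = 1 - e^{-(\mu-\lambda)\Delta_{\max}}$. The M/M/1 model is a simplifying assumption chosen for its closed form; the cited works combine queueing models with SINR distributions, which this example does not attempt. Function names and numbers are hypothetical.

```python
import math

def mm1_reliability(arrival_rate, service_rate, deadline):
    """P[T <= deadline] for the sojourn time T of an M/M/1 FCFS queue."""
    if service_rate <= arrival_rate:
        return 0.0                     # unstable queue: treated as unreliable here
    return 1.0 - math.exp(-(service_rate - arrival_rate) * deadline)

def min_service_rate(arrival_rate, deadline, target):
    """Smallest service rate that meets a target reliability (inverse of the above)."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    return arrival_rate - math.log(1.0 - target) / deadline

# Illustrative numbers: 100 packets/s offered load, 50 ms control-derived deadline.
LAMBDA = 100.0
DELTA_MAX = 0.05
mu_needed = min_service_rate(LAMBDA, DELTA_MAX, target=0.999)
print("required service rate (packets/s):", mu_needed)
print("achieved reliability:", mm1_reliability(LAMBDA, mu_needed, DELTA_MAX))
```

Read this way, a stability-derived $\Delta_{\max}$ translates directly into a minimum service rate, which is the quantity that bandwidth or antenna allocation would have to deliver.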

2.4 Delay-aware and Delay-minimizing Reinforcement Learning

Recent multi-agent reinforcement learning (MARL) frameworks for PDM, including those based on centralized training with decentralized execution (CTDE), explicitly encode delay into state/action spaces:

- Augmentation: Use action histories and explicit delays as part of the MDP state, allowing policies (e.g., in cooperative adaptive cruise control, CACC) to “plan ahead” for delay; this construction is formalized through multi-agent delay-aware MDP definitions (Liu et al., 24 Apr 2024, Xu et al., 18 Aug 2025).
- Dynamic topology: DCT-MARL uses multi-key gated message passing to adapt the V2V communication topology dynamically, weighting neighbor selection for robustness to delay (Xu et al., 18 Aug 2025).
- Model-based action filtering: Use optimal velocity model (OVM)-based or closed-form velocity control as a safety-ensuring fallback under high or uncertain delay (Liu et al., 24 Apr 2024).
- Results: For an 8-vehicle platoon, delay-aware MARL (DAMARL) achieves zero or near-zero collisions and the lowest headway/velocity error relative to baselines under $\tau=0.5$ s (Liu et al., 24 Apr 2024).
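
The state-augmentation idea can be sketched with a small buffer that delays actions by a fixed number of steps and appends the pending actions to the observation. This is a minimal illustration assuming a constant, known action delay measured in whole control steps; the class and method names are hypothetical and do not come from the cited frameworks.

```python
from collections import deque

class DelayedActionBuffer:
    """Holds actions that have been issued but have not yet taken effect."""

    def __init__(self, delay_steps, default_action=0.0):
        self.delay_steps = delay_steps
        # Oldest pending action sits at index 0.
        self.pending = deque([default_action] * delay_steps, maxlen=delay_steps)

    def push(self, action):
        """Queue a new action and return the one that takes effect now."""
        if self.delay_steps == 0:
            return action              # no delay: the action applies immediately
        applied = self.pending[0]
        self.pending.append(action)    # maxlen evicts the action just applied
        return applied

    def augment(self, observation):
        """Delay-aware state: raw observation plus all pending actions."""
        return tuple(observation) + tuple(self.pending)

# Usage: a two-step delay, e.g. 0.2 s at an assumed 0.1 s control period.
buffer = DelayedActionBuffer(delay_steps=2)
applied = [buffer.push(a) for a in (1.0, 2.0, 3.0)]
state = buffer.augment((5.0,))
print("applied actions:", applied)
print("augmented state:", state)
```

Because the augmented state contains every action still in flight, the resulting process is Markov again despite the delay, which is what lets a policy anticipate the effect of commands it has already sent.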

2.5 Task Offloading and Fog-Enabled Platoon PDM

- SMDP-based offloading: Offloading delay is minimized by formulating the problem as a continuous-time semi-Markov decision process (SMDP) whose states and event triggers account for random task arrivals, vehicular fog computing (VFC) resource arrivals/departures, and communication delay modeled with 802.11p contention.
- Optimal actions: The policy $\pi^*(s)$ balances assigning tasks to platoon vehicles (quick service, high cost), assigning them to the VFC (potentially lower compute delay but higher collision delays), or discarding them if resources are exhausted.
- Adaptivity: The optimal policy shifts its task handling mode as the VFC size, task arrival rate, and collision probability profile change, and is shown to outperform greedy allocation on long-term reward (Wu et al., 2023).
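
The flavor of the policy computation can be conveyed by value iteration on a deliberately small model. The sketch below is a discrete-time discounted MDP, not the continuous-time SMDP of Wu et al.: one task arrives per step, the state is the number of busy platoon servers, each busy server finishes with probability `q` per step, and the VFC is always available. The rewards, the value of `q`, and all names are hypothetical.

```python
from math import comb

def binom_pmf(n, k, p):
    """Probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def solve_offloading(K=3, q=0.5, gamma=0.9,
                     r_platoon=1.0, r_vfc=0.3, r_drop=-1.0, tol=1e-9):
    """Value iteration over states k = 0..K (number of busy platoon servers)."""
    rewards = {"platoon": r_platoon, "vfc": r_vfc, "drop": r_drop}

    def actions(k):
        # Platoon execution needs a free server.
        return ("platoon", "vfc", "drop") if k < K else ("vfc", "drop")

    def q_value(k, action, V):
        busy = k + 1 if action == "platoon" else k
        # Each busy server completes independently with probability q.
        future = sum(binom_pmf(busy, done, q) * V[busy - done]
                     for done in range(busy + 1))
        return rewards[action] + gamma * future

    V = [0.0] * (K + 1)
    while True:
        new_V = [max(q_value(k, a, V) for a in actions(k)) for k in range(K + 1)]
        converged = max(abs(a - b) for a, b in zip(new_V, V)) < tol
        V = new_V
        if converged:
            break
    policy = [max(actions(k), key=lambda a: q_value(k, a, V))
              for k in range(K + 1)]
    return V, policy

values, policy = solve_offloading()
for k, (v, a) in enumerate(zip(values, policy)):
    print(f"busy servers = {k}: action = {a}, value = {v:.3f}")
```

In this toy setting discarding is never chosen because the VFC is always available at a higher reward; modeling VFC capacity and contention-dependent transmission delay, as the SMDP formulation does, is what makes all three actions relevant.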

3. Analytical Modeling: Delay Metrics, Constraints, and Stability

3.1 Delay Metrics

Layer/Domain	Delay Metric/Formula	Control/Application
MAC (VANET)	$D_i = T_i / x_i$	One-hop packet delay
Intersection Schedule	$W_i =$ waiting time for vehicle $i$	Travel delay
Control Loop	$\tau_{\max} = \min(\tau_1, \tau_2)$	Plant/string stability
VFC Offloading	$T_{\text{tr}} = \theta T_{\text{slot}} E_{\text{tr}}$	Task roundtrip

3.2 Constraints

- Physical: MAC parameter bounds, safety (no headway violation), conflict zone occupation (no overlap).
- Network/control: Delay/latency upper bounds for stability, fairness constraints (max delay, waiting time).
- Resource: Bounded computing power, available subcarriers, VFC size.
- Queueing: Arrival/service rate constraints, collision probabilities.

4. Algorithms and Scheduling Strategies

Approach	Core Mechanism	Representative Reference
Swarming/PSO	PSO adjustment of per-vehicle MAC parameters	(Wu et al., 2018)
Greedy/Exhaustive Search	Complete enumeration of platoon scheduling	(Bashiri et al., 2018)
Deep Reinforcement Learning	DQN-based platoon size and action decisions	(Li et al., 2022)
Delay-aware MARL/CTDE	State–action augmentation, dynamic message passing	(Xu et al., 18 Aug 2025, Liu et al., 24 Apr 2024)
SMDP Offloading	Event-driven semi-Markov policy/value iteration	(Wu et al., 2023)
Analytical MPC	Robust DMPC with predictive scheduling of communication delay	(Hahn et al., 2018)

Numerical Highlights

- Swarming MAC optimization: For $n=6$, one-hop delay reduced from $4.6$ ms to $3.2$ ms; throughput increased from $0.48$ Mbps to $0.60$ Mbps (Wu et al., 2018).
- DRL-PDM at intersections: Mean travel time of $69.87$ s vs. $110.87$ s (fixed), with the lowest fuel consumption (89.29 mL/veh) (Li et al., 2022).
- Delay-aware MARL: DAMARL achieves zero collisions and robust string stability in all tested scenarios versus multiple baselines (Liu et al., 24 Apr 2024).
- MPC with predicted delays: A 30% overall cost reduction while accommodating time-varying communication delay (Hahn et al., 2018).
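
The relative improvements implied by the figures quoted in this article can be checked with simple arithmetic. The snippet below uses only numbers stated above (including the $22.7$ s vs. $43.3$ s intersection delays from Section 2.2); the helper name is hypothetical.

```python
def percent_change(before, after):
    """Signed relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

mac_delay = percent_change(4.6, 3.2)         # one-hop delay, ms
throughput = percent_change(0.48, 0.60)      # throughput, Mbps
travel_time = percent_change(110.87, 69.87)  # mean travel time, s
paim_delay = percent_change(43.3, 22.7)      # intersection delay, s

print(f"one-hop MAC delay:  {mac_delay:+.1f}%")
print(f"MAC throughput:     {throughput:+.1f}%")
print(f"DRL travel time:    {travel_time:+.1f}%")
print(f"PAIM vs fixed-time: {paim_delay:+.1f}%")
```

These work out to roughly a 30% lower one-hop delay, 25% higher throughput, 37% shorter travel time, and 48% lower intersection delay.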

5. Limitations, Trade-offs, and Design Insights

- Trade-offs between delay and fairness: Minimum-delay scheduling (e.g., an exhaustive platoon forming algorithm, PFA) may increase per-vehicle delay variance; batch/gated approaches offer a different balance between mean delay and its variance (Timmerman et al., 2019).
- Sensing and communication assumptions: Many methods assume perfect V2X; robustness to losses and actuation uncertainty is a continuing challenge (Li et al., 2022, Xu et al., 18 Aug 2025).
- Hardware coupling: MAC-level PDM may be constrained by non-configurable contention window sizes in commercial hardware (Wu et al., 2018).
- Scalability: Exact scheduling/enumeration is tractable only for moderate $N$; learning-based methods or surrogate optimization are preferred at larger scales (Bashiri et al., 2018, Li et al., 2022).
- Dynamic network adaptation: Adaptive schemes, especially MARL agents with topology selection, can respond to time-varying delay and packet loss, a key requirement for practical deployments (Xu et al., 18 Aug 2025, Liu et al., 24 Apr 2024).

6. Applicability and Extensions

PDM frameworks are extensible to:
- Multi-platoon corridors, under more complex topologies and bidirectional flows (Wu et al., 2018).
- Networked or tandem intersections (green waves), requiring global scheduling coordination (Timmerman et al., 2019).
- Integration with task allocation and fog/cloud resource management (Wu et al., 2023).
- Dynamic spectrum/bandwidth allocation and V2I handover management in cellular V2X systems (Wang et al., 2020, Zeng et al., 2018).
Analytical surrogates or fast simulators are required to enable real-time PDM optimization in large-scale, low-latency deployments.

7. Summary Table: Key PDM Strategies

PDM Flavor	System Level	Methodology	Key Metrics/Outcomes	Ref
MAC Delay Minimization	Physical	Two-step PSO	Up to ~30% one-hop delay reduction	(Wu et al., 2018)
Intersection PDM	Traffic Control	FCFS/greedy/exhaustive	Delay roughly halved vs. fixed-time signals ($22.7$ s vs. $43.3$ s)	(Bashiri et al., 2018)
DRL/CTDE PDM	Multi-agent RL	DQN/CTDE/MARL	Min travel time/fuel, robust string stability	(Li et al., 2022, Liu et al., 24 Apr 2024, Xu et al., 18 Aug 2025)
Analytical Control	Control/Comm	Delay-bounded design	Explicit $\tau_{\max}$, max reliability	(Wang et al., 2020, Zeng et al., 2018)
Task Offloading PDM	Fog Assistance	SMDP policy opt	Max long-term reward, adaptive mode shift	(Wu et al., 2023)

PDM is a multi-disciplinary optimization area that, through formal delay modeling and adaptive scheduling/control/allocation strategies, achieves significant improvements in latency, reliability, and throughput for platooning and cooperative vehicular systems. The ongoing evolution includes deeper integration of real-time learning, cross-layer codesign, and broadening to heterogeneous V2X environments.