Congestion Control (CC) Overview
- Congestion control is a set of algorithms and mechanisms that manage packet flows to prevent network overload and ensure efficient, fair resource usage.
- It encompasses classical, hybrid, and learning-based approaches applied across Internet, datacenter, wireless, IoT, and satellite networks.
- Recent advances integrate data-driven techniques such as reinforcement learning and multi-objective optimization to dynamically balance throughput, delay, and fairness.
Congestion control (CC) encompasses a broad family of algorithms, mechanisms, and frameworks that enable distributed endpoints and/or network elements to regulate packet injection rates in response to time-varying resource contention, thereby preventing persistent overloads, minimizing queuing delays, and enforcing fair utilization of shared network links. CC is fundamental to robust data transport, efficiency, and quality of service across the Internet, data centers, wireless networks, IoT, and satellite constellations. Diverse approaches have emerged, including traditional rule-based protocols, explicit signaling, cross-layer feedback, and, recently, data-driven and multi-objective learning-based paradigms.
1. Fundamental Congestion Control Principles
At its core, congestion arises when aggregate packet arrival rates transiently exceed the capacity of downstream links or buffers, producing queuing and, if unmitigated, loss or network collapse. Classical end-to-end CC (e.g., TCP Reno, Cubic) relies on indirect signals (packet loss, Explicit Congestion Notification (ECN) marks, and/or increased round-trip time (RTT)) to infer congestion and adjust transmission windows or rates according to fixed rules, typified by Additive Increase Multiplicative Decrease (AIMD). Delay-based approaches (e.g., Vegas) react to early signs of queuing but struggle in the presence of RTT measurement bias or aggressive cross-traffic. Hybrid techniques such as BBR model bottleneck bandwidth and minimum RTT to steer pacing, but may induce transient unfairness and converge slowly in the face of sudden network changes (Tafa et al., 2021).
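The AIMD rule mentioned above can be illustrated with a minimal sketch. The function names, the per-RTT granularity, and the toy loss model (loss is signalled whenever the window exceeds a fixed capacity) are assumptions made for illustration; real TCP implementations add slow start, fast recovery, and timers.

```python
from typing import List


def aimd_update(cwnd: float, loss_detected: bool,
                increase: float = 1.0, decrease: float = 0.5,
                min_cwnd: float = 1.0) -> float:
    """Return the next congestion window (in segments) after one RTT."""
    if loss_detected:
        # Multiplicative decrease: shrink the window by a constant factor.
        return max(min_cwnd, cwnd * decrease)
    # Additive increase: grow by a constant number of segments per RTT.
    return cwnd + increase


def simulate_aimd(capacity: float, rtts: int, cwnd: float = 1.0) -> List[float]:
    """Toy single-flow trace; loss is assumed whenever cwnd exceeds capacity."""
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        cwnd = aimd_update(cwnd, loss_detected=cwnd > capacity)
    return trace
```

Running `simulate_aimd` over many RTTs produces the familiar sawtooth: linear growth until the assumed capacity is exceeded, followed by a halving of the window.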
Contemporary CC objectives are multi-faceted: maximize throughput, minimize delay and loss, and ensure fairness among heterogeneous flows. However, no single point in this trade-off space is universally optimal. This has motivated the development of multi-objective and application-driven CC frameworks that can dynamically balance competing criteria based on explicit utility functions or learned policies (Zhang et al., 2021, Ma et al., 2021, Xing et al., 11 May 2025).
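One simple way to see why no single operating point is universally optimal is to scalarize the objectives with a preference vector. The sketch below is illustrative only: the `Outcome` fields, the normalization to the range 0..1, and the linear weighting are assumptions, not the utility used by any of the cited frameworks.

```python
from typing import NamedTuple, Sequence, Tuple


class Outcome(NamedTuple):
    throughput: float  # normalized link utilization, 0..1 (higher is better)
    delay: float       # normalized queuing delay, 0..1 (lower is better)
    loss: float        # loss rate, 0..1 (lower is better)


def scalarized_utility(o: Outcome, weights: Tuple[float, float, float]) -> float:
    """Linear trade-off: reward throughput, penalize delay and loss."""
    w_thr, w_delay, w_loss = weights
    return w_thr * o.throughput - w_delay * o.delay - w_loss * o.loss


def best_operating_point(candidates: Sequence[Outcome],
                         weights: Tuple[float, float, float]) -> Outcome:
    """Pick the candidate that maximizes the weighted utility."""
    return max(candidates, key=lambda o: scalarized_utility(o, weights))


# Hypothetical operating points: one buffer-filling, one latency-friendly.
bulk = Outcome(throughput=0.95, delay=0.8, loss=0.02)
interactive = Outcome(throughput=0.70, delay=0.2, loss=0.0)
```

With throughput-heavy weights such as `(1.0, 0.1, 0.1)` the `bulk` point wins, while delay-heavy weights such as `(0.3, 1.0, 0.1)` select `interactive`, which is the kind of preference-dependent choice that multi-objective frameworks aim to automate.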
2. Classical, Hybrid, and Data Center-Specific Mechanisms
Classical Internet CC includes:
- Loss-based protocols: TCP Reno and Cubic probe bandwidth aggressively, increasing window sizes until loss occurs and then backing off. They offer high throughput on well-provisioned links but conflate congestion with random loss, leading to poor wireless/IoT performance (Tafa et al., 2021).
- Delay-based protocols: Vegas and Copa detect incipient queuing via RTT inflation and adapt conservatively to avoid bufferbloat, but often cede bandwidth when competing with loss-based flows.
- Hybrid protocols: BBR and DCQCN attempt to control both bandwidth and delay, using model-based or explicit feedback (e.g., ECN marking in DCQCN for RDMA) to modulate sending rates and provide injection throttling in high-speed datacenter fabrics (Olmedilla et al., 7 Nov 2025).
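To make the delay-based idea concrete, the following is a minimal sketch of a Vegas-style window update. It estimates how many segments are sitting in the bottleneck queue from the gap between the expected rate (at the minimum observed RTT) and the actual rate. The function name, the thresholds `alpha` and `beta`, and the floor of two segments are illustrative assumptions.

```python
def vegas_update(cwnd: float, base_rtt: float, rtt: float,
                 alpha: float = 2.0, beta: float = 4.0) -> float:
    """Return the next congestion window (segments) for one RTT sample.

    base_rtt is the minimum RTT seen so far (propagation-delay estimate);
    rtt is the latest measurement. Both are in seconds.
    """
    expected = cwnd / base_rtt             # rate if no queuing occurred
    actual = cwnd / rtt                    # rate actually achieved
    queued = (expected - actual) * base_rtt  # estimated segments in queue
    if queued < alpha:
        return cwnd + 1.0                  # little queuing: probe for more
    if queued > beta:
        return max(2.0, cwnd - 1.0)        # queue building: back off early
    return cwnd                            # within the target band: hold
```

Because the sender backs off as soon as a few segments are queued, a competing loss-based flow that keeps filling the buffer will tend to take the bandwidth the delay-based flow gives up.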
Recent refinements in high-speed datacenter and supercomputer CC include:
- Enhanced DCQCN variants incorporate more accurate congestion detection (e.g., marking only packets that drive queue growth), selective feedback aggregation, and per-flow severity-aware rate updates to minimize unnecessary throttling and drastically reduce control overhead (Olmedilla et al., 7 Nov 2025).
- In-network learning-based schemes: Solutions such as GraphCC distribute Graph Neural Network (GNN) agents across switches, which cooperate in real-time to set local ECN marking thresholds and maximize aggregate utility (throughput, low buffer occupancy) under dynamic, failure-prone conditions (Bernárdez et al., 2023).
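The ECN marking thresholds referred to above are commonly expressed as a RED-style ramp at the switch queue. The sketch below shows that ramp under assumed parameter names (`k_min`, `k_max`, `p_max`); it is not the marking logic of any specific DCQCN variant or of GraphCC, which would adjust such thresholds dynamically.

```python
def ecn_mark_probability(queue_len: float, k_min: float,
                         k_max: float, p_max: float) -> float:
    """Probability of ECN-marking an arriving packet given queue occupancy."""
    if not (0.0 <= p_max <= 1.0) or k_min >= k_max:
        raise ValueError("require 0 <= p_max <= 1 and k_min < k_max")
    if queue_len <= k_min:
        return 0.0                      # short queue: never mark
    if queue_len > k_max:
        return 1.0                      # long queue: mark every packet
    # Between the thresholds, ramp linearly from 0 up to p_max.
    return p_max * (queue_len - k_min) / (k_max - k_min)
```

Lowering `k_min` makes senders slow down earlier at the cost of throughput, while raising it tolerates deeper queues; a learned in-network agent can be viewed as choosing these thresholds per port as conditions change.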
3. Learning-Based and Multi-Objective Congestion Control
Learning-based CC has redefined the design space via reinforcement learning (RL), multi-armed bandit frameworks, and hybrid rule/learned controllers (Jiang et al., 2020, Tafa et al., 2021, Zhang, 2020). The following archetypes have been established:
- Offline RL-trained policies: Agents learn mappings from state vectors (e.g., throughput, delay, loss, current window) to rate or window updates by optimizing over simulated traces, then export a fixed model for online inference. Example: Sage (Mazilu et al., 29 Oct 2025).
- Online utility-gradient frameworks: Agents periodically probe sending rates and update based on direct feedback from measured (e.g., application-parameterized) utilities, as in PCC Vivace or Hercules (Cohen et al., 18 May 2025, Rozen-Schiff et al., 2024).
- Multi-objective RL: Frameworks such as MOCC and DeepCC are conditioned on vectors capturing throughput/latency/loss (e.g., preference or goal vectors), and output Pareto-optimal or requirement-satisfying actions per the presented trade-offs (Ma et al., 2021, Zhang et al., 2021). DeepCC (Zhang et al., 2021) uses an offline-trained DDPG actor–critic agent that, at runtime, interpolates over a policy surface parameterized by user-specified trade-off vectors and can fine-tune online to meet hard absolute application requirements.
- Cross-layer and multi-agent RL: Architectures such as MACC split the control task between agents at the transport (e.g., TCP sender) and network (e.g., AQM router queue) layers, jointly optimizing via value-decomposition and centralized training for distributed execution (Bai et al., 2022).
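The online utility-gradient archetype in the list above can be sketched as follows. The sender probes a slightly higher and a slightly lower rate, scores each with a utility function, and moves in the direction of higher utility. The utility form, its coefficients, the step size, and the `toy_link` model (loss appears only above a fixed capacity, with no RTT growth) are all assumptions for illustration and do not reproduce any cited system.

```python
from typing import Callable, Tuple

# A measurement is (rtt_gradient, loss_rate) observed at a given sending rate.
Measure = Callable[[float], Tuple[float, float]]


def utility(rate: float, rtt_gradient: float, loss_rate: float,
            t: float = 0.9, b: float = 900.0, c: float = 11.35) -> float:
    """Reward rate (concavely); penalize RTT growth and loss.

    The coefficients t, b, c are placeholder values.
    """
    return rate ** t - b * rate * rtt_gradient - c * rate * loss_rate


def gradient_step(rate: float, measure: Measure, eps: float = 0.05,
                  step: float = 1.0, min_rate: float = 0.1) -> float:
    """Probe rate*(1+eps) and rate*(1-eps), then follow the utility gradient."""
    hi, lo = rate * (1 + eps), rate * (1 - eps)
    u_hi = utility(hi, *measure(hi))
    u_lo = utility(lo, *measure(lo))
    grad = (u_hi - u_lo) / (hi - lo)
    return max(min_rate, rate + step * grad)


def toy_link(capacity: float) -> Measure:
    """Hypothetical bottleneck: excess traffic above capacity is dropped."""
    def measure(rate: float) -> Tuple[float, float]:
        loss = max(0.0, (rate - capacity) / rate)
        return 0.0, loss
    return measure


def run(start_rate: float, capacity: float, steps: int) -> float:
    rate, measure = start_rate, toy_link(capacity)
    for _ in range(steps):
        rate = gradient_step(rate, measure)
    return rate
```

In this toy model, `run(10.0, 100.0, 300)` climbs toward the capacity and settles slightly below it, because the upper probe starts to incur loss before the base rate does.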
Recent work—e.g., ASC (Xing et al., 11 May 2025)—explicitly decouples congestion control objectives from the policy model, allowing per-application or per-flow customization (e.g., latency, jitter, or a weighted combination), and leverages rapid fine-tuning and client-server deployment for scalability.
4. Performance Evaluation, Benchmarking, and Adversarial Validation
Rigorous and standardized evaluation of CC schemes remains a primary challenge (Abbasloo, 2023). Evaluation proceeds along two principal axes:
- Single-flow efficiency (throughput/delay): Metrics defined in terms of delivery rate and RTT quantify the ability to fill a pipe at minimal latency. Top performers in recent benchmarks include RL-driven and delay-based schemes (Orca, Indigo, Vegas, Copa) (Abbasloo, 2023).
- TCP-friendliness and fairness: The deviation from the fair-share rate is central in multi-flow scenarios. Cubic, BBRv2, and C2TCP maintain the highest winning rates here.
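As a rough illustration of the two axes, the sketch below computes a Kleinrock-style power score (rate divided by delay) and the relative deviation of each flow from an equal share of the bottleneck. These are generic textbook forms chosen as assumptions; they are not claimed to match the exact scoring formulas of the cited benchmark.

```python
from typing import List, Sequence


def power_score(delivery_rate: float, rtt: float, alpha: float = 1.0) -> float:
    """Throughput-vs-delay score: higher rate and lower RTT both raise it.

    alpha > 1 weights throughput more heavily (illustrative parameter).
    """
    return delivery_rate ** alpha / rtt


def fair_share_deviation(rates: Sequence[float], capacity: float) -> List[float]:
    """Relative distance of each flow's rate from capacity / number_of_flows."""
    fair = capacity / len(rates)
    return [abs(r - fair) / fair for r in rates]
```

For example, two flows at 60 and 40 Mbit/s on a 100 Mbit/s link each deviate from the 50 Mbit/s fair share by a relative 0.2.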
No scheme achieves a 100% winning rate. ML-based and delay-based controllers often excel on one axis but generalize poorly or are starved by loss-based flows in adversarial or mixed-regime settings. For robustness, automated adversarial testing frameworks such as CC-Fuzz systematically generate packet patterns to expose controller pathologies, e.g., BBR’s permanent stall bug or Reno’s classic “shrew attack” vulnerability (Ray et al., 2022).
Comprehensive benchmarking suites (e.g., CC-Bench1, CC-Bench2) formalize test topologies, parameter sweeps, and winner-selection rules to drive fair and reproducible comparisons and guide the field toward standardized evaluation (Abbasloo, 2023).
5. Specialized Contexts: Wireless, IoT, Satellite, and Datacenter Congestion Control
Wireless (cellular, IoT):
Wireless networks introduce random loss, variable capacity, and time-varying cross-traffic. Classical designs suffer excessive backoff and unused bandwidth due to spurious loss signals. Endpoint-centric techniques such as PBE-CC extract real-time physical-layer bandwidth estimates (e.g., 5G PRB allocation) and directly pace senders accordingly, eclipsing BBR and CUBIC in throughput and latency, especially under rapid mobile capacity changes (Xie et al., 2020). IoT and constrained-device scenarios motivate ultra-lightweight adaptive algorithms, exemplified by CoCoA+ (adaptive RTO/RTO aging) for CoAP, which outperforms static backoff in high-loss, high-density sensor topologies (Sanaboina et al., 2023).
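To show what an adaptive retransmission timeout looks like compared with a static backoff, here is a minimal sketch of the standard smoothed-RTT estimator from RFC 6298. It is used as a stand-in and is not the CoCoA+ algorithm itself, which maintains its own estimators and RTO aging; the class name, clamp bounds, and initial RTO are assumptions.

```python
from typing import Optional


class RtoEstimator:
    """RFC 6298-style retransmission timeout estimator (times in seconds)."""

    def __init__(self, initial_rto: float = 2.0,
                 min_rto: float = 1.0, max_rto: float = 60.0) -> None:
        self.srtt: Optional[float] = None    # smoothed RTT
        self.rttvar: Optional[float] = None  # RTT variation
        self.min_rto, self.max_rto = min_rto, max_rto
        self.rto = initial_rto

    def on_rtt_sample(self, rtt: float) -> float:
        """Fold a new RTT measurement into the estimate and return the RTO."""
        if self.srtt is None:
            self.srtt, self.rttvar = rtt, rtt / 2.0
        else:
            # Update the variation first, using the previous smoothed RTT.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        rto = self.srtt + 4.0 * self.rttvar
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto

    def on_timeout(self) -> float:
        """Exponential backoff after a retransmission timeout."""
        self.rto = min(self.max_rto, self.rto * 2.0)
        return self.rto
```

An estimator of this kind lets the timeout track the measured path instead of relying on a fixed value, which is the general behavior that adaptive schemes for constrained devices build on.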
Satellite/LEO constellations:
LEO networks confront fast RTT swings and frequent handovers. RL-based schemes, even fairness-aware ones, underperform when link parameters fall outside their training regimes, as seen with Sage and Astraea (Mazilu et al., 29 Oct 2025). BBRv3 maintains reasonable throughput but lags unless it is tuned for fast RTprop adaptation. Model-based protocols benefit from cross-layer handover feedback and shorter RTT filters.
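The effect of a shorter RTT filter can be sketched with a windowed-minimum estimator of the propagation delay. The class below is a generic sliding-window minimum, with window lengths chosen purely for illustration; it is not taken from any BBR implementation.

```python
from collections import deque
from typing import Deque, Tuple


class WindowedMinRtt:
    """Track the minimum RTT seen within the last `window_s` seconds."""

    def __init__(self, window_s: float) -> None:
        self.window_s = window_s
        # (timestamp, rtt) pairs; RTTs strictly increase from front to back,
        # so the front entry is always the current windowed minimum.
        self._samples: Deque[Tuple[float, float]] = deque()

    def update(self, now: float, rtt: float) -> float:
        # Drop older samples that can never be the minimum again.
        while self._samples and self._samples[-1][1] >= rtt:
            self._samples.pop()
        self._samples.append((now, rtt))
        # Expire samples that have aged out of the window.
        while self._samples[0][0] < now - self.window_s:
            self._samples.popleft()
        return self._samples[0][1]


# Hypothetical handover: path RTT jumps from 30 ms to 60 ms at t = 3 s.
long_filter = WindowedMinRtt(window_s=10.0)
short_filter = WindowedMinRtt(window_s=2.0)
for f in (long_filter, short_filter):
    f.update(0.0, 0.030)
long_estimate = long_filter.update(3.0, 0.060)
short_estimate = short_filter.update(3.0, 0.060)
```

After the assumed handover, the long filter still reports the stale 30 ms minimum while the short filter has already moved to 60 ms, which illustrates why shorter filters help when the propagation delay changes frequently.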
Datacenter interconnects:
Datacenters demand algorithms that react with microsecond latency and guarantee fairness under bursty, highly synchronized workloads. Data-driven methods, including RL policies distilled to low-latency decision trees and deployed in firmware (e.g., on NVIDIA ConnectX-6Dx NICs), deliver superior performance—near line-rate goodput and orders-of-magnitude lower tail latency compared to DCQCN and Swift—across a range of cluster scales. Protocol innovations in DCQCN refinement, as in rev–DCQCN (ECP/ENP/ERP), drastically cut victim slowdowns and control overhead [251