Dynamic PD Ratio Adjustment

Updated 8 March 2026

Dynamic PD ratio adjustment is a mechanism that continuously tunes the balance between two competing components using adaptive control variables for optimal system performance.
It leverages closed-loop feedback and optimization techniques, such as SGD and PID control, to adjust weights or thresholds in applications like neural networks, metric learning, and decentralized finance.
Reported results include gains in accuracy, latency, throughput, and capital efficiency over static or manually tuned baselines.

Dynamic PD Ratio Adjustment denotes any mechanism in which the relative weight, threshold, or target allocation between two competing or complementary streams (“P” and “D”, where P/D may signify prefill/decode, positive/negative, price/demand, or other domain-specific dichotomies) is explicitly and continuously tuned during system operation. The paradigm turns static hyperparameters and fixed resource splits into data- or context-dependent control variables. Applications include neural network architecture (dynamic skip weighting), contrastive metric learning, LLM inference serving, decentralized finance protocol management, and power systems. The “PD ratio” may be an explicit scalar parameter, a multidimensional vector, or an implicit ratio encoded through adaptive thresholds or resource allocation rules.

1. Core Methodologies and Theoretical Formulations

Dynamic PD ratio adjustment is formalized as a control or optimization problem in which the ratio parameter (weight, threshold, resource split, or target allocation) is adapted to optimize a performance objective under varying loads or distributions.

Neural Networks: Residual Path Weighting

In AdaResNet (Su, 2024), each residual block introduces a learnable scalar $w_{tfd}^{ipd}$ that replaces the static skip-add rule. The block output $y$ becomes $y = tfd + w_{tfd}^{ipd} \cdot ipd$, where $ipd$ is the block input (identity path) and $tfd$ the transformed features. The weight $w_{tfd}^{ipd}$ is updated by gradient descent, optionally with regularization. This allows per-block, data-dependent reweighting of input (identity) versus transformed features.
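A minimal sketch of the weighted skip rule, assuming features held in plain Python lists and a squared-error loss. The function names, learning rate, step count, and toy data are illustrative assumptions and are not taken from AdaResNet; a real block would learn the weight jointly with the convolutional parameters.

```python
def block_output(tfd, ipd, w):
    """Weighted residual y = tfd + w * ipd, applied elementwise."""
    return [t + w * i for t, i in zip(tfd, ipd)]


def skip_weight_grad(tfd, ipd, w, target):
    """Gradient of 0.5 * sum((y - target)^2) with respect to the scalar w."""
    y = block_output(tfd, ipd, w)
    return sum((yi - ti) * xi for yi, ti, xi in zip(y, target, ipd))


def train_skip_weight(tfd, ipd, target, w=1.0, lr=0.1, steps=100):
    """Plain gradient descent on w (lr and steps are illustrative values)."""
    for _ in range(steps):
        w -= lr * skip_weight_grad(tfd, ipd, w, target)
    return w


# Toy data constructed so that the best skip weight is 0.5, not the static 1.0.
ipd = [1.0, 2.0, -1.0]
tfd = [0.2, -0.3, 0.1]
target = [t + 0.5 * i for t, i in zip(tfd, ipd)]
w_learned = train_skip_weight(tfd, ipd, target)
```

Starting from the static value `w = 1.0`, the update moves the weight toward the value that the toy data favors, which is the behavior a fixed skip connection cannot express.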

Metric Learning: Loss and Mining Thresholds

Dual Dynamic Threshold Adjustment Strategy (DDTAS) (Jiang et al., 2024) adapts both the sample mining thresholds $(\gamma_{pos}, \gamma_{neg})$ and the Soft Contrastive loss margin $\lambda$ based on the evolving ratio of mined positive to negative pairs. The adaptive tolerance mechanism computes new thresholds $\hat{\gamma}_{pos} = \gamma_{pos} + \kappa\gamma_{pos}\sigma(\xi), \qquad \hat{\gamma}_{neg} = \gamma_{neg} - \kappa\gamma_{neg}\sigma(\xi)$, where $\xi$ is the negative/positive pair ratio, $\sigma$ is the sigmoid function, and $\kappa$ scales the adjustment. This steers the effective PD ratio toward robust pair selection.
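The threshold update above can be evaluated directly. The sketch below is a plain transcription of the two formulas; the function names and the numeric inputs in the usage lines are illustrative assumptions, not values from the paper.

```python
import math


def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def adapt_thresholds(gamma_pos, gamma_neg, xi, kappa):
    """Return (gamma_pos_hat, gamma_neg_hat) for a negative/positive pair ratio xi."""
    s = sigmoid(xi)
    gamma_pos_hat = gamma_pos + kappa * gamma_pos * s
    gamma_neg_hat = gamma_neg - kappa * gamma_neg * s
    return gamma_pos_hat, gamma_neg_hat


# Illustrative values only: thresholds 0.8 / 0.4, ratio xi = 0, kappa = 0.1.
pos_hat, neg_hat = adapt_thresholds(0.8, 0.4, xi=0.0, kappa=0.1)
```

Because the sigmoid is bounded by 1, each threshold moves by at most a fraction $\kappa$ of its current value, however large the pair ratio becomes.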

LLM Serving: Phase Disaggregation

Dynamic P/D adjustment architectures for LLM inference (Liao et al., 26 Nov 2025, Hong et al., 28 Apr 2025) formalize optimal resource allocation between prefill and decoding as $R_{opt} = \frac{cc_d}{OSL} \cdot \frac{t_p}{t_d}$, where $cc_d$ is the D-instance concurrency, $OSL$ the output sequence length, and $t_p$ / $t_d$ the profiled prefill and decode phase latencies, with the allocation counted in instances per phase. Both DOPD and semi-PD implement real-time or windowed control based on system telemetry, profiling, and short-term arrival forecasting.
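As a worked instance of the $R_{opt}$ formula, the sketch below evaluates it and converts the result into an instance split. Two things are assumptions made for illustration: that $R_{opt}$ is read as a prefill-to-decode instance ratio, and the rounding rule in the hypothetical helper `split_instances`. Neither DOPD nor semi-PD is claimed to work this way, and the numeric inputs are invented.

```python
def optimal_pd_ratio(cc_d, osl, t_p, t_d):
    """R_opt = (cc_d / OSL) * (t_p / t_d), with t_p and t_d in the same unit."""
    return (cc_d / osl) * (t_p / t_d)


def split_instances(total, ratio):
    """Split `total` instances into (prefill, decode) with prefill/decode near `ratio`.

    Assumption: `ratio` is prefill:decode; at least one instance is kept per phase.
    """
    prefill = round(total * ratio / (1.0 + ratio))
    prefill = max(1, min(total - 1, prefill))
    return prefill, total - prefill


# Invented profile: decode concurrency 64, 256 output tokens,
# 400 ms profiled prefill latency, 50 ms profiled decode latency.
r_opt = optimal_pd_ratio(cc_d=64, osl=256, t_p=400.0, t_d=50.0)
prefill_n, decode_n = split_instances(total=12, ratio=r_opt)
```

With these inputs the ratio is (64 / 256) × (400 / 50) = 2, so a pool of 12 instances splits into 8 prefill and 4 decode under the stated reading.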

DeFi Protocols: Target Weight Control

In Perpetual Demand Lending Pools (PDLPs) (Chitra et al., 9 Feb 2025), the PD-ratio is the target portfolio weight vector, which is adjusted continuously by a PID-style update so that pool exposure stays aligned with current market structure and lending demand.
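To make "PID-style" concrete, the sketch below applies a textbook discrete PID step to each component of a weight vector and then projects the result back to a valid allocation. This is a generic controller written for illustration, not the update rule of the PDLP paper; the class name, the gains, and the clip-and-renormalize projection are all assumptions.

```python
class PIDWeightController:
    """Textbook discrete PID step applied per component of a weight vector."""

    def __init__(self, kp, ki, kd, n):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = [0.0] * n      # accumulated error per component
        self.prev_error = [0.0] * n    # last error, for the derivative term

    def step(self, weights, desired):
        """Move `weights` toward `desired` and return a valid allocation."""
        updated = []
        for k, (w, d) in enumerate(zip(weights, desired)):
            error = d - w
            self.integral[k] += error
            derivative = error - self.prev_error[k]
            self.prev_error[k] = error
            updated.append(
                w + self.kp * error + self.ki * self.integral[k] + self.kd * derivative
            )
        # Assumed projection: clip negatives, then renormalize to sum to 1.
        updated = [max(0.0, x) for x in updated]
        total = sum(updated)
        if total == 0.0:
            return [1.0 / len(updated)] * len(updated)
        return [x / total for x in updated]


# Illustrative use with proportional action only.
controller = PIDWeightController(kp=0.5, ki=0.0, kd=0.0, n=2)
weights = controller.step([0.5, 0.5], desired=[0.8, 0.2])
```

With proportional action only, each step closes half of the remaining gap to the desired weights; the integral and derivative gains trade faster tracking against the risk of overshoot.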

2. Feedback, Optimization, and Update Mechanisms

Adjustment of the PD ratio universally relies on closed-loop feedback: observed imbalance, loss gradients, throughput bottlenecks, extraction of workload statistics, or realized portfolio deviations. Mechanisms include:

Stochastic Gradient Descent (SGD): Directly optimizing the scalars or vectors that represent the PD ratio via loss gradients (AdaResNet, DDTAS loss threshold).
Windowed Feedback Controllers: Periodic recalibration (semi-PD, DOPD), using recent percentile latency measurements to update phase resource allocation.
Meta-Learning/One-step Lookahead: Treating thresholds as outer-loop variables, optimized with respect to a validation meta-loss (DDTAS).
PID Controllers / Online Averaging: Aggregate historical error, rate-of-change, and current deviation in a time-series of pool weights or resource utilization (PDLPs).
Forecasting and Interpolation: Short-term predictions (ARIMA, data profiling) to proactively set the prospective optimal ratio before workload shifts manifest (DOPD).
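The windowed-feedback item in the list above can be sketched as a small controller that keeps recent latency samples per phase and shifts one instance toward the phase whose 90th-percentile latency most exceeds its target. Everything here is a hypothetical illustration: the class, the one-instance step, the nearest-rank percentile, and the SLO parameters are assumptions rather than the semi-PD or DOPD algorithms.

```python
from collections import deque


def percentile(values, q):
    """Nearest-rank percentile of a non-empty sequence, for integer q in 1..100."""
    ordered = sorted(values)
    rank = max(1, -(-q * len(ordered) // 100))  # ceiling of q * n / 100
    return ordered[rank - 1]


class WindowedPDController:
    """Keeps a sliding window of latencies and rebalances prefill/decode instances."""

    def __init__(self, prefill, decode, prefill_slo, decode_slo, window=100):
        self.prefill, self.decode = prefill, decode
        self.prefill_slo, self.decode_slo = prefill_slo, decode_slo
        self.prefill_lat = deque(maxlen=window)
        self.decode_lat = deque(maxlen=window)

    def observe(self, prefill_latency, decode_latency):
        self.prefill_lat.append(prefill_latency)
        self.decode_lat.append(decode_latency)

    def rebalance(self):
        """Shift one instance toward the phase with the worse P90-to-SLO ratio."""
        if not self.prefill_lat:
            return self.prefill, self.decode
        p_over = percentile(self.prefill_lat, 90) / self.prefill_slo
        d_over = percentile(self.decode_lat, 90) / self.decode_slo
        changed = False
        if p_over > 1.0 and p_over >= d_over and self.decode > 1:
            self.prefill, self.decode = self.prefill + 1, self.decode - 1
            changed = True
        elif d_over > 1.0 and d_over > p_over and self.prefill > 1:
            self.prefill, self.decode = self.prefill - 1, self.decode + 1
            changed = True
        if changed:
            # Samples gathered under the old split no longer describe the system.
            self.prefill_lat.clear()
            self.decode_lat.clear()
        return self.prefill, self.decode
```

Moving a single instance per window and discarding stale samples after each change is one simple way to keep such a loop from oscillating; a production scheduler would also have to account for the cost of reconfiguration.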

Update frequency and aggressiveness (learning rate, step size, interval) are hyperparameters chosen to balance reactivity and stability, often with explicit bounds or smoothing to avoid oscillation.

3. Domain-Specific Architectures and Implementation

Domain	PD Ratio Role	Dynamic Update Mechanism
ResNets	Skip weight $w_{tfd}^{ipd}$	SGD with optional L2 regularizer
Metric Learning	Mining/loss threshold	AT-ASMS feedback + meta-gradient
LLM Serving	Prefill/decode resource split	Windowed feedback, SLO targeting
DeFi Protocols	Target weight vector	PID-style time-series control
Power Systems	Tap ratio	Real-time OPF co-optimization

Each architecture embeds the dynamic ratio into the main execution loop (neural net forward/backward, distributed scheduler, lending protocol function), incurring negligible overhead relative to reconfiguration costs or batch computation.

4. Quantitative Gains and Empirical Validation

The cited works report the following gains for dynamic PD ratio schemes over static or manually tuned baselines:

AdaResNet: Up to +54% relative test accuracy improvement on CIFAR-10, and visibly faster convergence (accuracy surpasses static baseline by epoch 5 vs. epoch 10) (Su, 2024).
DDTAS: 2–3% Recall@1 improvement on image retrieval; up to 1% over competing dynamic threshold methods. DDTAS achieves R@1 = 68.4 (CUB-200) and 86.4 (Cars-196), with competitive results on SOP (Jiang et al., 2024).
DOPD: 1.5× goodput increase vs vLLM baseline, P90 TTFT reduction up to 67.5%, and SLO attainment >99% under production load traces (Liao et al., 26 Nov 2025).
semi-PD: 1.27–2.58× lower latency and 1.55–1.72× more SLO-satisfying requests as compared to static/unified configurations across Llama and DeepSeek models (Hong et al., 28 Apr 2025).
PDLPs: Enhanced capital efficiency, tighter arbitrage bounds, and a reduced required delta-hedge as the dynamic target weight tracks market conditions (Chitra et al., 9 Feb 2025).

5. Hyperparameter Sensitivity, Stability, and Practical Considerations

Dynamic PD ratio mechanisms introduce new hyperparameters (learning rates, regularization strength, controller step size, forecast window, PID gains) that govern convergence, bias, and adaptivity.

Robustness: Empirical studies in AdaResNet and DDTAS show that a reasonable range of learning rates yields stable final metrics; out-of-range values either underfit (updates too slow) or destabilize (noisy/adversarial behavior).
Bound Enforcement: Most systems employ explicit clipping, normalization, or projection to maintain constraints on the ratio (e.g., allocations must sum to the hardware budget).
Update Granularity: Real-time systems (e.g., OPF with tap ratio) may adapt as frequently as every second; LLM serving windows range from ≈1 sec (semi-PD) to several seconds (DOPD).

Mitigation of transient imbalance, oscillatory oversteer, or measurement noise is achieved through conservative step limits, smoothing, or hybrid controller designs.
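The three guards named above (smoothing, a step limit, and bound enforcement) compose naturally into one update function. The sketch below is illustrative only; the function name, parameter names, and numeric values are assumptions and do not come from any of the cited systems.

```python
def guarded_update(current, proposed, lo, hi, max_step, alpha):
    """Blend toward `proposed`, cap the per-update change, then clip to [lo, hi]."""
    # Exponential smoothing: alpha = 1 follows the proposal fully, alpha = 0 ignores it.
    smoothed = current + alpha * (proposed - current)
    # Conservative step limit: never move more than max_step in one update.
    delta = max(-max_step, min(max_step, smoothed - current))
    # Bound enforcement: keep the ratio inside its allowed range.
    return max(lo, min(hi, current + delta))


# Illustrative values: a noisy controller proposes jumping from 1.0 to 3.0.
ratio = guarded_update(current=1.0, proposed=3.0, lo=0.5, hi=2.0, max_step=0.25, alpha=0.5)
```

Here the smoothed target would be 2.0, but the step limit admits only a move to 1.25, so a single noisy measurement cannot swing the ratio across its whole range.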

6. Extensions, Cross-Domain Applications, and Limitations

Dynamic PD ratio adjustment is a generalizable principle:

Neural Network Blocks/Transformers: Applies to balancing identity/updated streams in self-attention/MMoE; plausible extension to sub-layer residuals in very deep transformers (Su, 2024).
Reinforcement Learning / NLP: Useful where the optimal “memory-forget” tradeoff varies across tasks or input distributions.
Multi-resource Dataflow: Generalizes to multidimensional scheduling (heterogeneous GPU types, storage tiers) when more than two phases must be balanced (Liao et al., 26 Nov 2025).
Finance: The dynamic PID target-weight update in PDLPs provides a template for risk-managed pool rebalancing, arbitrage bounds, and capital efficiency under feedback with oracle delays and multi-pool splits (Chitra et al., 9 Feb 2025).

Limitations include sensitivity to rare heavy-tail outliers (ultra-long prompts in LLM serving), lag in feedback or forecasting models, and the need for domain-specific safety constraints (e.g., discount cap, risk eigenvalue monitoring).

7. Summary and Impact

Dynamic PD ratio adjustment turns rigid, static hyperparameters into closed-loop, context-aware controllers, improving efficiency, stability, and adaptability in systems characterized by producer-consumer asymmetry, compositional data flows, or distributional drift. Its instantiations across neural computing, metric learning, LLM system infrastructure, power systems, and decentralized finance protocols are united by their reliance on measurable feedback, algorithmic optimization, enforceable guards, and empirical validation (Su, 2024, Jiang et al., 2024, Liao et al., 26 Nov 2025, Hong et al., 28 Apr 2025, Chitra et al., 9 Feb 2025, Bliek, 2013). The approach reduces the tuning burden, stabilizes operational metrics under load, and yields quantifiable improvements in accuracy, throughput, goodput, and capital efficiency.