Dynamic Range-Aware Quantization (DRAQ)

Updated 28 November 2025
  • Dynamic Range-Aware Quantization is a method that adapts quantization steps and clipping boundaries to the signal’s empirical range, enhancing fidelity.
  • It optimizes quantization intervals through per-tensor, per-layer, or per-node strategies, balancing accuracy with hardware tradeoffs.
  • Empirical results across sensors, CNNs, and GNNs show that DRAQ minimizes quantization-induced information loss even at low bit widths.

Dynamic Range-Aware Quantization (DRAQ) encompasses a class of quantization methodologies that explicitly model, optimize, or dynamically adapt to the signal or parameter dynamic range in hardware, neural network, and sensor systems. The DRAQ paradigm leverages precise estimation or adaptation of quantization intervals—whether per-tensor, per-layer, per-channel, per-node, or per-time-step—based on data distributional properties or external indices such as time or graph neighborhoods. DRAQ methods improve quantized model fidelity, mitigate information loss due to outlier-induced range expansion, and provide efficient algorithmic and hardware tradeoffs for resource-constrained deployments across diverse domains, from image sensors to deep networks.

1. Definition and Core Principles

Dynamic Range-Aware Quantization refers to quantization schemes where the quantization step sizes, clipping boundaries, or scale factors are chosen based on both the target system’s resource constraints and the empirical or estimated dynamic range of the input data, activations, weights, or other relevant signals. This contrasts with static quantization, where uniform, fixed ranges are employed regardless of the actual signal content.
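
To make the contrast concrete, the following minimal sketch (illustrative bit width and fixed range, not drawn from any specific paper) compares a static fixed-range quantizer with a range-aware one on the same data:

```python
import numpy as np

def quantize(x: np.ndarray, lo: float, hi: float, bits: int = 8) -> np.ndarray:
    """Uniform affine quantization of x onto the range [lo, hi]."""
    qmax = 2 ** bits - 1
    scale = (hi - lo) / qmax
    q = np.clip(np.round((x - lo) / scale), 0, qmax)   # integer codes
    return q * scale + lo                              # dequantized values

x = np.random.randn(10_000) * 0.1          # narrow empirical range
static_q = quantize(x, -4.0, 4.0)          # fixed range, ignores the data
aware_q = quantize(x, x.min(), x.max())    # range taken from the data itself
print(np.mean((x - static_q) ** 2), np.mean((x - aware_q) ** 2))
```

Because the range-aware scale is roughly ten times smaller here, its quantization error is about two orders of magnitude lower; this is the effect DRAQ methods exploit systematically.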

For a base illustration, consider the one-bit sensor context. The DRAQ concept is formalized by the fact that the quantization step for the estimated probability $\hat p$ is $\Delta = 1/N$ if $N$ frames are used to derive $\hat p$:

$$\Delta = 1/N$$

where only values in the discrete set $\{0, 1/N, 2/N, \dots, 1\}$ are realizable. The resulting exposure estimator, with $H = -\ln(1-\hat{p})$, implies that both the minimum resolvable exposure and the maximum measurable exposure are direct functions of $N$:

$$H_{-qntz} \approx \Delta = 1/N \quad ; \quad H_{+qntz} = \ln(N)$$

and the quantization-limited dynamic range is:

$$DR_{qntz} = \frac{H_{+qntz}}{H_{-qntz}} \approx N\ln N$$

(Koerner, 2023).
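
As a quick numerical check of these relations, the following minimal sketch evaluates $H_{-qntz}$, $H_{+qntz}$, and $DR_{qntz}$ for a few frame counts (the chosen values of $N$ are arbitrary):

```python
import math

def one_bit_sensor_dynamic_range(num_frames: int):
    """Quantization-limited exposure range of a one-bit sensor whose hit
    probability estimate uses `num_frames` frames (step size 1/N)."""
    delta = 1.0 / num_frames        # quantization step of p-hat
    h_min = delta                   # minimum resolvable exposure, H_-qntz ~ 1/N
    h_max = math.log(num_frames)    # maximum measurable exposure, H_+qntz = ln(N)
    return h_min, h_max, h_max / h_min   # DR_qntz ~ N ln N

for n in (256, 1024, 4096):
    h_min, h_max, dr = one_bit_sensor_dynamic_range(n)
    print(f"N={n}: H_min~{h_min:.1e}, H_max={h_max:.2f}, DR~{dr:.0f}")
```

For $N = 1024$ this gives $DR_{qntz} \approx 7.1\times 10^{3}$, consistent with $N\ln N$.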

Generalizations of DRAQ principles to high-dimensional settings follow similar logic—explicit modeling of clipping factors, learnable scale/offsets, and per-domain strategies for adapting quantization intervals.

2. Representative DRAQ Algorithms Across Domains

A wide variety of schemes instantiate DRAQ in neural network quantization, image sensor readout, and even graph-structured models:

  • Frame-based DRAQ in one-bit sensors: The quantization of photon hit rates over $N$ frames yields quantization steps of $1/N$, capping the maximum estimable exposure at $\ln(N)$. This imposes an explicit tradeoff between temporal resolution, bandwidth, and dynamic range for direct-detection sensors (Koerner, 2023).
  • REQuant for CNN post-training quantization: DRAQ is realized as a per-layer, per-weight-scale clipping parameter $\alpha$ minimizing mean-squared quantization error, efficiently determined via golden-section search due to local unimodality and convexity (Yang et al., 5 Oct 2025). Non-uniform extensions (power-law transformations) further adapt to heavy-tailed weight distributions.
  • Probabilistic input-adaptive DRAQ: Per-input quantization parameters are set using moment estimates of the pre-activation distribution, computed analytically from the input vector and stored layer-wise meta-parameters—yielding a fast, memory-light, data-dependent quantization (Santini et al., 15 May 2025).
  • Temporal DRAQ for diffusion models: A small MLP predicts quantization scale and offset as a function of generation time step, enabling scale factors that adapt to highly non-stationary activation statistics without runtime overhead at inference (So et al., 2023).
  • Node-aware DRAQ in GNNs: Each node embedding is quantized with individually initialized and dynamically refined intervals propagated via message passing and updated by neighborhood aggregation, avoiding information loss in collaborative filtering workloads (Li et al., 22 Aug 2025).
  • Differentiable DRAQ (DDQ): All quantization levels, bitwidths, and dynamic ranges are made differentiable and learnable, with quantization grid boundaries optimized via gradient descent, supporting mixed-precision and adaptive resolution (Zhaoyang et al., 2021).
  • In-hindsight DRAQ in training: Quantization of activations and gradients is performed using quantization ranges accumulated from previous iterations, eliminating the memory cost associated with dynamic range estimation for each forward/backward pass (Fournarakis et al., 2021); a minimal sketch of this scheme appears below this list.
  • DRAQ for learned image compression on FPGAs: Statistically-calibrated activation clipping and outlier-aware weight regularization reduce catastrophic quantization-induced rate-distortion penalties, supporting bitwidth tuning and energy-efficient deployment (Fang et al., 21 Nov 2025).

These approaches vary in whether the dynamic range adaptation is explicit (analytic or learned), at what scale the intervals are defined (per-layer, per-channel, per-node, per-time), and in how the adaptation is realized (optimization, calibration, or surrogate modeling).
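
As an illustration of the in-hindsight variant noted above, here is a minimal sketch that assumes an exponential moving average as the accumulation rule (the momentum value, the fallback on the first step, and the class name are illustrative, not the published method):

```python
import numpy as np

class InHindsightRange:
    """Quantize the current tensor with the range accumulated from previous
    iterations, then update that range for use in the next iteration."""

    def __init__(self, bits: int = 8, momentum: float = 0.9):
        self.qmax = 2 ** bits - 1
        self.momentum = momentum
        self.lo = self.hi = None        # accumulated range

    def __call__(self, x: np.ndarray) -> np.ndarray:
        if self.lo is None:             # first step: fall back to current min/max
            self.lo, self.hi = float(x.min()), float(x.max())
        scale = max(self.hi - self.lo, 1e-8) / self.qmax
        q = np.clip(np.round((x - self.lo) / scale), 0, self.qmax)
        x_hat = q * scale + self.lo     # dequantized tensor used downstream
        # The range update happens after quantization, so no extra pass over
        # the tensor is needed before it is quantized.
        self.lo = self.momentum * self.lo + (1 - self.momentum) * float(x.min())
        self.hi = self.momentum * self.hi + (1 - self.momentum) * float(x.max())
        return x_hat
```

Because the range applied at each step is fixed before the tensor is produced, no additional pass over the tensor is required to estimate it, which is where the memory-movement savings cited above come from.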

3. Mathematical Formalization and Optimization

The mathematical framework of DRAQ decomposes into step size and clipping parameter selection strategies. Typical core elements include:

  • Quantization step size (uniform):

$$s(\alpha) = \frac{\alpha w_m}{2^{b-1}-1}$$

with $w_m$ the maximal absolute weight and $\alpha\in(0,1]$ a learnable or optimizable parameter (Yang et al., 5 Oct 2025).

  • Power-law/non-uniform quantization: Employs transformations such as $\tilde{w} = \text{sign}(w)\sqrt{|w|}$ to compress heavy-tailed distributions, with range selection as before.
  • Interval selection (probabilistic):

$$\Delta(x) = \frac{(\alpha+\beta)\,\sigma_z(x)}{2^b-1}$$

where $\mu_z(x)$ and $\sigma_z(x)$ are input-conditional moments estimated via surrogate layers or closed-form analytic expressions (Santini et al., 15 May 2025).

  • Optimizing the clipping parameter:

$$\alpha^* = \arg\min_{\alpha\in(0,1]} \frac{1}{|W|}\sum_{w\in W}\bigl(w - s(\alpha)\cdot w_q\bigr)^2$$

exploiting the fact that the objective is piecewise smooth and locally convex (Yang et al., 5 Oct 2025); a minimal sketch of this search is given below.

  • End-to-end differentiable quantization: Quantization points $q_i$ and bitwidth gating variables $g_i$ are made differentiable and trained under a composite objective $L = L_\text{task} + \lambda L_\text{mem}$ (Zhaoyang et al., 2021).

These optimization strategies are analytically tractable, converge rapidly on practical problems, and provide closed-form or iterative rules for deployment-calibrated quantization.
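
The clipping-factor search can be sketched as follows (a minimal illustration assuming symmetric uniform weight quantization; the bit width, tolerance, and synthetic Laplacian weights are arbitrary choices):

```python
import numpy as np

def quant_mse(w: np.ndarray, alpha: float, bits: int = 4) -> float:
    """MSE between weights and their symmetric uniform quantization with
    step size s(alpha) = alpha * max|w| / (2^(b-1) - 1)."""
    qmax = 2 ** (bits - 1) - 1
    s = alpha * np.abs(w).max() / qmax
    w_q = np.clip(np.round(w / s), -qmax, qmax)       # integer codes
    return float(np.mean((w - s * w_q) ** 2))

def golden_section_alpha(w: np.ndarray, bits: int = 4, tol: float = 1e-3) -> float:
    """Golden-section search for alpha in (0, 1], relying on local unimodality."""
    phi = (np.sqrt(5) - 1) / 2
    a, b = 1e-3, 1.0
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if quant_mse(w, c, bits) < quant_mse(w, d, bits):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    return 0.5 * (a + b)

w = np.random.laplace(scale=0.05, size=10_000)        # heavy-tailed weights
alpha_star = golden_section_alpha(w)
print(f"alpha* ~ {alpha_star:.3f}, MSE = {quant_mse(w, alpha_star):.2e}")
```

For heavy-tailed distributions the optimum typically lies well below 1: a few outlier weights are deliberately clipped so that the step size shrinks for the bulk of the distribution.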

4. Practical Implementation and Tradeoffs

DRAQ imposes both hardware and algorithmic considerations:

  • Sensor/frame DRAQ: The number $N$ of collected frames directly determines $\Delta = 1/N$, so a designer must select $N \geq \max(\exp(H_{\max,\text{scene}}),\, 1/H_{\min,\text{scene}})$ for a target exposure range (Koerner, 2023); a short sketch of this selection rule follows the table below.
  • CNN quantization: Layerwise DRAQ schemes based on analytic error minimization or input-adaptive quantization reduce the representation loss at low bitwidths (even 4 bits) with negligible performance drop and avoid the excessive memory or computational overhead of true per-sample or dynamic per-tensor quantization (Santini et al., 15 May 2025, Yang et al., 5 Oct 2025).
  • GNNs and collaborative filtering: Node-level quantization intervals, continually refined via message passing, allow for minimal accuracy loss under 2-bit quantization, outperforming baseline methods and providing a direct mechanism to couple quantization precision to graph topology and semantic clusters (Li et al., 22 Aug 2025).
  • Learned image compression: Statistically grounded activation clipping (with $k = 625\lambda + 2$ for the clipping multiplier) and outlier regularization ($\alpha$-percentile bounds, small $\beta$) allow 8-bit integer deployment on FPGAs while reducing the BD-rate overhead from 30% to 6.3% relative to FP32 (Fang et al., 21 Nov 2025).

The following table summarizes key tunable parameters in DRAQ across domains:

| Domain | Main DRAQ Parameter(s) | Selection Rule |
| --- | --- | --- |
| One-bit sensor | $N$ (frames) | $N \geq \max(\exp(H_{\max}),\, 1/H_{\min})$ |
| CNNs (REQuant) | $\alpha$ (clipping) | Minimize MSE via golden-section search |
| CNNs (probabilistic) | $\alpha$, $\beta$ (coverage) | Chosen via calibration on small data, e.g., 99.9% mass |
| Image compression | $k$ (clip), $\alpha$, $\beta$ | $k = 625\lambda + 2$ for activations; grid over $\alpha$, $\beta$ |
| GNNs (GNAQ) | $\min_i$, $\max_i$ per node | Updated by message-passing on embedding neighborhoods |

Careful selection, calibration, or joint learning of such parameters is central to DRAQ performance.
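
For the one-bit sensor row, the selection rule can be applied directly; the snippet below is a minimal sketch with arbitrary example exposure bounds:

```python
import math

def min_frames(h_min_scene: float, h_max_scene: float) -> int:
    """Smallest N with N >= max(exp(H_max), 1/H_min), so the quantizer both
    resolves the dimmest exposure and reaches the brightest one."""
    return math.ceil(max(math.exp(h_max_scene), 1.0 / h_min_scene))

print(min_frames(1e-3, 6.0))   # exp(6) ~ 403 < 1/1e-3 = 1000, so N = 1000
```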

5. Empirical Evaluation and Comparative Performance

DRAQ has been empirically validated on various benchmarks, demonstrating improved tradeoffs between accuracy and performance loss due to quantization:

  • Post-training quantization (REQuant): Achieves near lossless accuracy at 8 and 6 bits, and state-of-the-art accuracy at 4 bits (e.g., ResNet-50 4/4: 94.47% vs. 92.63% for OMSE) (Yang et al., 5 Oct 2025).
  • Lightweight DRAQ for embedded vision: Input-adaptive, probabilistic DRAQ approaches drop in-domain mean average precision by 1.6% (vs. 8% for static quantization) and incur only 10–25% additional latency, substantially lower than conventional dynamic range estimation (Santini et al., 15 May 2025).
  • Temporal DRAQ in diffusion models: At W8A4, TDQ-LSQ achieves FID=4.56 versus 6.20 for static baseline; QAT on LDM/LSUN-Churches with 4/4 bits at FID=4.64 (So et al., 2023).
  • GNN collaborative filtering: Node-aware DRAQ increases Recall@10 and NDCG@10 by 27.8% and 17.6%, respectively (2-bit quantization, four public datasets) compared to strong baselines (Li et al., 22 Aug 2025).
  • LIC for FPGAs: DRAQ reduces BD-rate overhead from 30% (QAT baseline) to 6.3%; activation clipping alone reduces to 11.3%, and weight regularization to 18.9% (Fang et al., 21 Nov 2025).
  • Differentiable DRAQ: Achieves full-precision (71.9%) performance even at 4-bit quantization on ImageNet/MobileNetV2 with only 30 epochs, outperforming previous approaches (Zhaoyang et al., 2021).
  • In-hindsight DRAQ for training: Matches or exceeds dynamic min-max methods on top-1 accuracy (ResNet18: 58.99% vs. 58.77%) with 4–8× lower memory movement (Fournarakis et al., 2021).

6. Generalizations, Extensions, and Theoretical Insights

DRAQ is not limited to uniform quantization, specific architectures, or offline model optimization. The methodology extends as follows:

  • Multi-bit & non-uniform extensions: For $b$-bit sensors or quantizers, the principle that step size $\Delta = 1/N$ (or per-bucket normalization in histograms) governs the achievable dynamic range generalizes, regardless of the nonlinearity of the readout function (Koerner, 2023).
  • Dynamic indices: Any predictable, non-stationary variation (temporal, positional, or hierarchical) can be used to drive DRAQ scale generation via MLPs, Fourier features, or context-dependent embeddings (So et al., 2023); a minimal sketch follows this list.
  • Neighborhood-based quantization: In GNNs and similar topologies, dynamic range adaptation can be relational—based on local context rather than global statistics—enabling information-preserving, low-bit quantization even in highly structured or sparse data (Li et al., 22 Aug 2025).
  • Analytical and hardware-efficient objectives: DRAQ aligns with closed-form or provably convex local objectives, supporting both efficient optimization and theoretical error guarantees. Lightweight surrogate models further render per-input, per-layer adaptation feasible under stringent memory constraints (Santini et al., 15 May 2025).
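
A hedged sketch of such an index-driven scale generator, in the spirit of the temporal approach above (the sinusoidal time features, layer sizes, and module name are illustrative assumptions rather than the published architecture):

```python
import torch
import torch.nn as nn

class TimeConditionedQuantizer(nn.Module):
    """Predicts a quantization scale and offset from the generation time step,
    so the grid can track non-stationary activation statistics."""

    def __init__(self, bits: int = 4, n_freqs: int = 8, hidden: int = 64):
        super().__init__()
        self.qmax = 2 ** bits - 1
        self.register_buffer("freqs", 2.0 ** torch.arange(n_freqs, dtype=torch.float32))
        self.mlp = nn.Sequential(              # untrained here; in practice it would be
            nn.Linear(2 * n_freqs, hidden),    # trained jointly with the task loss
            nn.SiLU(),
            nn.Linear(hidden, 2),              # -> (log-scale, offset)
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([torch.sin(t[:, None] * self.freqs),
                           torch.cos(t[:, None] * self.freqs)], dim=-1)
        log_scale, offset = self.mlp(feats).unbind(-1)
        scale = log_scale.exp().view(-1, 1)    # positive step size per sample
        offset = offset.view(-1, 1)
        q = torch.clamp(torch.round((x - offset) / scale), 0, self.qmax)
        return q * scale + offset              # fake-quantized activations

x, t = torch.randn(2, 16), torch.tensor([10.0, 500.0])   # (batch, features), time steps
print(TimeConditionedQuantizer()(x, t).shape)             # torch.Size([2, 16])
```

Because the predicted scale and offset depend only on the index (here the time step), they can be precomputed for every step ahead of inference, which is how the runtime overhead stays negligible.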

DRAQ thereby provides a universal framework for quantization-aware optimization, efficient hardware deployment, and principled signal fidelity control across the signal processing and deep learning landscape.
