Learnable Companding Quantization (LCQ)
- Learnable Companding Quantization (LCQ) is a method that integrates parameterized, learnable companding functions into quantization to adaptively place levels based on task-driven optimization.
- It reduces quantization error and maintains performance at low bitwidths by allocating more quantization levels to critical signal regions using techniques like piecewise-linear or power-law companding.
- LCQ is applied in neural network inference, sensor ADC pipelines, and learned image compression, demonstrating improved accuracy and efficiency compared to traditional uniform quantization.
Learnable Companding Quantization (LCQ) refers to a family of quantization schemes in which a parameterized, learnable companding (compressing and expanding) function is integrated into the quantization process. This approach enables adaptive, non-uniform placement of quantization levels, shaped via task-driven optimization rather than analytic or heuristic rules. LCQ has shown significant benefits across domains including low-bit deep neural network inference, sensor-to-network pipelines, and end-to-end neural compression, consistently reducing bitrates or quantization error at fixed task performance while maintaining computational efficiency.
1. Mathematical Formulations of Learnable Companding Quantization
All LCQ methods begin with a companding function $g_\theta$, parameterized by $\theta$, that transforms the dynamic range of scalars (weights, activations, or raw measurements) before quantization. This transformation is typically non-linear, allowing denser placement of quantization points around high-importance value ranges, and more sparse placement elsewhere.
In major LCQ frameworks:
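- Piecewise-Linear Companding (Activation/Weight Quantization): For $x \in [-\alpha, \alpha]$, with $\alpha$ a learnable clipping threshold, the normalized magnitude $|x|/\alpha$ is mapped via a learnable compressor $g_\theta$ (with $K$ intervals), then quantized uniformly, and finally reconstructed by an expander $g_\theta^{-1}$, yielding the overall quantized value:
$Q(x) = \operatorname{sgn}(x)\,\alpha\, f_\theta\!\left(|x|/\alpha\right)$
with
$f_\theta = g_\theta^{-1} \circ q_b \circ g_\theta.$
Here, $q_b$ is a uniform $b$-bit quantizer on $[0, 1]$, with $g_\theta$ and $g_\theta^{-1}$ both parameterized by learnable slopes and breakpoints (Yamamoto, 2021).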
- Power-Law or A-Law Companding (ADC/Compression): For input $x$ (e.g., $x \in [0, 1]$ or $x \in [-1, 1]$), the companding functions take the form:
$c_\gamma(x) = x^{\gamma} \qquad \text{(for } x \ge 0\text{)},$
or, for signed inputs, a sign-preserving form such as $\operatorname{sgn}(x)\,|x|^{\gamma}$.
These are followed by affine scaling and rounding to an $n$-bit quantizer, with an optional analytic inverse for data reconstruction (Fatima et al., 26 Sep 2025).
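In learned image compression, a spatially adaptive A-law compander is estimated per spatial location by a neural net. In the standard A-law form, the mapping is linear, $F_A(x) = \operatorname{sgn}(x)\,\frac{A|x|}{1 + \ln A}$, for $|x| < 1/A$, and logarithmic for higher $|x|$ (Zhang et al., 2023).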
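The piecewise-linear scheme described above (compress, quantize uniformly, expand) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation from Yamamoto (2021): the names `make_compander` and `lcq_quantize`, the softmax parameterization of the interval heights, and the fixed equal-width input breakpoints are all hypothetical choices made for the example, and `alpha` stands for a clipping threshold.

```python
import math
from bisect import bisect_right

def softmax(theta):
    m = max(theta)
    e = [math.exp(t - m) for t in theta]
    s = sum(e)
    return [v / s for v in e]

def make_compander(theta):
    """Build a monotone piecewise-linear compressor g and its inverse on [0, 1].

    Assumption: K = len(theta) equal-width input intervals; the output height
    of each interval is softmax(theta)[k], so g(0) = 0 and g(1) = 1.
    """
    k = len(theta)
    xs = [i / k for i in range(k + 1)]   # input breakpoints
    ys = [0.0]
    for h in softmax(theta):             # cumulative output breakpoints
        ys.append(ys[-1] + h)
    ys[-1] = 1.0                         # guard against rounding drift

    def interp(v, src, dst):
        v = min(max(v, 0.0), 1.0)
        i = min(bisect_right(src, v) - 1, k - 1)
        t = (v - src[i]) / (src[i + 1] - src[i])
        return dst[i] + t * (dst[i + 1] - dst[i])

    def g(x):
        return interp(x, xs, ys)

    def g_inv(y):
        return interp(y, ys, xs)

    return g, g_inv

def uniform_quantize(y, bits):
    """Uniform quantizer on [0, 1] with 2**bits levels (round half up)."""
    levels = 2 ** bits - 1
    return math.floor(y * levels + 0.5) / levels

def lcq_quantize(x, alpha, theta, bits):
    """Clip to [-alpha, alpha], then compress, quantize, expand, and rescale."""
    g, g_inv = make_compander(theta)
    u = min(abs(x) / alpha, 1.0)
    sign = -1.0 if x < 0 else 1.0
    return sign * alpha * g_inv(uniform_quantize(g(u), bits))
```

With all logits equal, the compressor is the identity and the quantizer reduces to a clipped uniform one; raising the first logit steepens the compressor near zero, which pulls the reconstruction levels toward small magnitudes.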
2. Optimization and Gradient Flow
LCQ systems jointly optimize the companding parameters together with network weights or subsequent processing stages. The inherent non-differentiability of the rounding/quantization operation is handled using the straight-through estimator (STE):
- STE for Quantization: For scalar rounding,
$\frac{\partial\, \mathrm{round}(x)}{\partial x} := 1,$
so the gradient is passed as if rounding were the identity map.
- For piecewise-linear companders in LCQ, gradients with respect to the companding parameters are computed by backpropagating through the compressor and expander using analytic chain rule decomposition (Yamamoto, 2021).
- All LCQ parameters—clipping thresholds, companding function slopes, and, where present, spatial or channelwise adaptation variables—are trained under the task-specific objective (e.g., classification cross-entropy, detection loss, or rate-distortion tradeoff) using standard optimizers such as SGD or Adam (Yamamoto, 2021, Fatima et al., 26 Sep 2025, Zhang et al., 2023).
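To make the straight-through estimator concrete, the following sketch writes out the backward pass by hand for a toy problem: a single weight `w` is trained so that `round(w * x)` hits a target. The setup, the squared-error loss, and the function names are illustrative assumptions and do not come from any of the cited papers; the only point is that the derivative of the rounding step is replaced by 1.

```python
import math

def round_half_up(v):
    return math.floor(v + 0.5)

def loss_and_grad(w, x, target):
    """Squared error of round(w * x) against target, with an STE backward pass."""
    pre = w * x
    y = round_half_up(pre)
    loss = (y - target) ** 2
    dloss_dy = 2.0 * (y - target)
    dy_dpre = 1.0        # STE: treat rounding as the identity in the backward pass
    dpre_dw = x
    return loss, dloss_dy * dy_dpre * dpre_dw

def train(w, x, target, lr=0.1, steps=20):
    """Plain gradient descent on w using the STE gradient."""
    for _ in range(steps):
        _, grad = loss_and_grad(w, x, target)
        w -= lr * grad
    return w
```

The true derivative of rounding is zero almost everywhere, so without the substitution `dy_dpre = 1.0` the weight would never move; with it, `train(0.2, 1.0, 3.0)` walks `w` upward until the rounded output equals the target and the gradient vanishes.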
3. Architectural Variants and Application Contexts
LCQ methods have been deployed in several technologically important settings:
- Neural Network Quantization: The canonical LCQ method uses a learnable piecewise-linear compressor and expander for both weights and activations, followed by uniform quantization. Layerwise "limited weight normalization" is often introduced at the quantization boundary for stability at ultra-low bitwidths (Yamamoto, 2021).
- Sensor/ADC Quantization: A learnable companding law (e.g., power law $x^{\gamma}$, possibly with an offset) is applied to raw sensor readings prior to quantization, with parameters trained jointly with the downstream network. Both unsigned and signed input domains are supported, and the companding function remains low-dimensional (few parameters) for hardware feasibility (Fatima et al., 26 Sep 2025).
- Image Compression: In end-to-end learned image compression schemes, spatially adaptive companding functions (A-law form with per-location scale predicted from local features) are used to warp latent codes before vector quantization. The companding parameters are output by a small CNN, yielding fine-grained adaptation to local statistics, followed by a diamond-lattice vector quantizer (Zhang et al., 2023).
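As an illustration of the sensor/ADC variant, the sketch below applies a power-law compander, affine scaling, and rounding to produce integer codes, together with the analytic inverse. It assumes inputs already normalized to $[0, 1]$ (unsigned) or $[-1, 1]$ (signed) and a fixed exponent `gamma`; in the method described above the exponent would be learned jointly with the downstream network. The function names and the particular mapping of signed values onto codes are hypothetical.

```python
import math

def gamma_quantize(x, gamma, bits):
    """Unsigned input in [0, 1] -> integer code in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    x = min(max(x, 0.0), 1.0)
    return int(math.floor((x ** gamma) * levels + 0.5))

def gamma_dequantize(code, gamma, bits):
    """Analytic inverse: undo the scaling, then apply the 1/gamma power."""
    levels = 2 ** bits - 1
    return (code / levels) ** (1.0 / gamma)

def signed_gamma_quantize(x, gamma, bits):
    """Signed input in [-1, 1]: sign-preserving power law, shifted to [0, 1]."""
    levels = 2 ** bits - 1
    x = min(max(x, -1.0), 1.0)
    c = math.copysign(abs(x) ** gamma, x)
    return int(math.floor(0.5 * (c + 1.0) * levels + 0.5))

def signed_gamma_dequantize(code, gamma, bits):
    levels = 2 ** bits - 1
    c = 2.0 * code / levels - 1.0
    return math.copysign(abs(c) ** (1.0 / gamma), c)
```

With `gamma = 0.5` and 4 bits, a small reading such as `x = 0.04` is reconstructed almost exactly, whereas the linear case (`gamma = 1.0`) rounds it to the nearest multiple of 1/15. An exponent below 1 therefore spends more codes on small magnitudes, at the cost of coarser steps near full scale.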
4. Empirical Performance and Task-Level Impact
LCQ schemes consistently outperform uniform and traditional non-uniform quantizers at low bitwidths:
| Setting | Uniform Baseline (2–4 bit) | LCQ (2–4 bit) | Gap to Full Precision or Reference |
|---|---|---|---|
| ImageNet (ResNet-50) (Yamamoto, 2021) | 73.8–75.6% top-1 | 75.1–76.6% top-1 | ≤1.7% (2-bit) |
| COCO Detection (ResNet-50) (Yamamoto, 2021) | 36.2 AP (W4/A4) | 37.1 AP (W4/A4) | 1.2 AP |
| HAR (WEAR, DeepConvLSTM) (Fatima et al., 26 Sep 2025) | 63.7 (4-bit linear) | 70.1 (4-bit γ-Quant) | 0.9 (vs. 12-bit) |
| Raw Image Object Detection (Fatima et al., 26 Sep 2025) | 0.31 mAP (4-bit linear) | 0.47 mAP (γ-Quant) | ≈ same as sRGB |
| Learned Compression BD-Rate (Zhang et al., 2023) | +3.89% (uniform) | +3.12% (LVQAC best) | 0.77% |
On ImageNet and COCO, LCQ eliminates 60–90% of the accuracy gap to full precision at 2–4 bits (Yamamoto, 2021). In ADC or sensor-driven applications, learned companding at 4 bits comes close to or matches 12-bit linear or ISP-sRGB performance, as the table above shows (Fatima et al., 26 Sep 2025). In neural compression, replacing uniform scalar quantization with spatially adaptive companding and vector quantization yields BD-rate reductions of up to 0.5 pp on standard datasets (Zhang et al., 2023).
5. Principles of Adaptivity
A central principle of LCQ is the adaptation of quantization point density to either learned statistics or locality:
- Statistical Adaptivity: Companding parameters are optimized to place quantization levels where signal or weight distributions are densest or most task-relevant, refining over time as training progresses (Yamamoto, 2021, Fatima et al., 26 Sep 2025).
- Spatial/Channel Adaptivity: In image compression, per-location companders are predicted from spatial context, enabling dynamic range adaptation to local feature map variance (Zhang et al., 2023). In sensor pipelines, learned companding is often performed per input channel (e.g., accelerometer axis) or dataset-wide (Fatima et al., 26 Sep 2025).
- Hardware Feasibility: LCQ applied at the sensor (ADC) level employs extremely low-dimensional (often single-parameter) companding functions, ensuring ease of hardware realization and minimal overhead (Fatima et al., 26 Sep 2025).
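Spatial adaptivity can be illustrated with the textbook A-law compander applied with a different parameter `a` at each location. This is a sketch under several assumptions: the per-location values in `a_map` stand in for the output of the parameter-predicting network, inputs are taken to lie in $[-1, 1]$, and plain scalar rounding replaces the lattice vector quantizer used in the cited compression work. All function names are hypothetical.

```python
import math

def a_law_compress(x, a):
    """Standard A-law: linear for |x| < 1/a, logarithmic up to |x| = 1."""
    ax = abs(x)
    denom = 1.0 + math.log(a)
    if ax < 1.0 / a:
        y = a * ax / denom
    else:
        y = (1.0 + math.log(a * ax)) / denom
    return math.copysign(y, x)

def a_law_expand(y, a):
    """Inverse of a_law_compress for |y| <= 1."""
    ay = abs(y)
    denom = 1.0 + math.log(a)
    if ay < 1.0 / denom:
        x = ay * denom / a
    else:
        x = math.exp(ay * denom - 1.0) / a
    return math.copysign(x, y)

def quantize_latents(latents, a_map, levels=8):
    """Compand each value with its own `a`, round uniformly, then expand."""
    out = []
    for x, a in zip(latents, a_map):
        y = a_law_compress(x, a)
        yq = math.floor(y * levels + 0.5) / levels
        out.append(a_law_expand(yq, a))
    return out
```

A larger `a` stretches small magnitudes over more of the companded range, so a location predicted to have low variance can be given finer resolution near zero than a neighbouring location with a wide dynamic range.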
6. Training Considerations and Extensions
Key methodological components for stable and performant LCQ training include:
- Straight-Through Estimator: Used across all three lines of work to let gradients flow through the non-differentiable quantization steps (Yamamoto, 2021, Fatima et al., 26 Sep 2025, Zhang et al., 2023).
- Limited Weight Normalization: For quantization of neural network weights, applying standardization at the quantizer interface (not throughout the layer) substantially improves stability at very low bitwidths (Yamamoto, 2021).
- Companding Interval Resolution: In piecewise-linear LCQ, increasing the number of companding intervals ($K$) from 4 to 16 produces measurable gains at 2–3 bits, with diminishing returns beyond that (Yamamoto, 2021).
- Extensions: Potential avenues include joint (vector) companding for entire weight or activation vectors under shared parameters, learned non-uniform breakpoints, dynamic (layerwise) bitwidth selection, and applying spatial adaptation more generally outside image compression (Yamamoto, 2021, Zhang et al., 2023).
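The idea of standardizing weights only at the quantizer interface can be sketched as follows. This is a loose illustration, not the exact formulation of limited weight normalization in Yamamoto (2021): the per-tensor mean and standard deviation, the step that maps quantized values back to the original scale, and the plain clipped uniform quantizer used as a stand-in for the companding quantizer are all assumptions, as are the function names.

```python
import math

def standardize(ws):
    """Return zero-mean, unit-variance weights plus the statistics used."""
    mu = sum(ws) / len(ws)
    var = sum((w - mu) ** 2 for w in ws) / len(ws)
    sigma = math.sqrt(var) or 1.0   # avoid dividing by zero for constant weights
    return [(w - mu) / sigma for w in ws], mu, sigma

def clipped_uniform(x, alpha, bits):
    """Symmetric signed quantizer on [-alpha, alpha] (stand-in quantizer)."""
    levels = 2 ** (bits - 1) - 1
    u = max(-1.0, min(1.0, x / alpha))
    return alpha * math.floor(u * levels + 0.5) / levels

def quantize_with_limited_norm(ws, alpha=2.0, bits=3):
    """Standardize only for the quantizer, then map back to the weight scale."""
    z, mu, sigma = standardize(ws)
    zq = [clipped_uniform(v, alpha, bits) for v in z]
    return [sigma * v + mu for v in zq]
```

Because the quantizer always sees inputs with a fixed scale, a single clipping threshold `alpha` behaves consistently across layers, while the rest of the layer continues to operate on weights in their original range.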
7. Significance, Trade-Offs, and Broader Impact
Learnable companding quantization unlocks several benefits across efficiency-critical deployments:
- Memory and Throughput: LCQ allows neural networks at 2–4 bit precision to retain accuracy close to full-precision counterparts, facilitating memory- and bandwidth-constrained scenarios (Yamamoto, 2021).
- Sensor Power Consumption: In ADC-driven pipelines, reducing bit depth from 12–16 down to 2–4 with learned companding substantially lowers the energy spent on analog-to-digital conversion, which can account for up to half of sensor power (Fatima et al., 26 Sep 2025).
- Compression Performance: In learned image compression, spatially adaptive companded quantization delivers consistent rate–distortion improvement over uniform quantization, with minimal added inference cost (Zhang et al., 2023).
- Task Preservation: Across all domains, non-uniform level placement via learnable companding preserves task-relevant information by pushing quantization error toward the least critical regions of the signal. Empirical results match or exceed standard baselines at fixed bitrates.
A plausible implication is that LCQ-based quantization will form a core primitive in emerging low-resource, bandwidth-limited, and edge-deployed AI systems, where every quantization bit and fraction of a percent accuracy is significant. The modularity and training synergy of companding functions with end-task objectives enable tailored quantization, efficiently adapted to both data and hardware constraints.