Finite-Precision AC Coding: FPA-CCDM
- FPA-CCDM is a framework that maps binary data into fixed-composition sequences using arithmetic coding under finite-precision constraints.
- It employs quantized interval arithmetic with model rounding and renormalization to ensure rate-optimality and invertibility despite limited precision.
- The technique balances precision, computational complexity, and resource usage, making it suitable for high-throughput systems like 5G and optical communications.
Finite-Precision Arithmetic Coding-based Constant-Composition Distribution Matching (FPA-CCDM) is a framework for lossless distribution matching employing arithmetic coding under finite-precision constraints. It targets the mapping of binary data into sequences with exactly prescribed empirical symbol distributions. FPA-CCDM forms the core of signal shaping architectures for contemporary communication systems, such as probabilistically shaped modulation in optical fiber and 5G. The primary technical focus is on achieving rate-optimality and invertibility despite practical limitations on arithmetic precision, integer word-length, and circuit complexity.
1. Fundamental Concepts and the CCDM Framework
CCDM transforms input sequences of $k$ bits (typically uniform, i.i.d. Bernoulli(½)) into $n$-symbol output sequences of fixed empirical composition (“type”), chosen to approximate a target distribution $P_A$ over an alphabet $\mathcal{A}$ of size $m$. Formally, for a composition vector $(n_1, \dots, n_m)$ with $\sum_{a=1}^{m} n_a = n$, the constant-composition set is

$$\mathcal{T} = \{\, c \in \mathcal{A}^n : \text{symbol } a \text{ occurs exactly } n_a \text{ times in } c,\ a = 1, \dots, m \,\},$$

and the encoder is constructed such that all output blocks have this exact composition. The maximal number of input bits is $k = \lfloor \log_2 |\mathcal{T}| \rfloor$, where $|\mathcal{T}| = n!/(n_1! \cdots n_m!)$, and the corresponding mapping is invertible and fixed-to-fixed length (Schulte et al., 2015).
To minimize informational divergence, the type $P_{\bar{A}}(a) = n_a/n$ is selected (subject to integer constraints) to minimize $D(P_{\bar{A}} \,\|\, P_A)$, that is, to best approximate $P_A$ (Schulte et al., 2015).
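The composition selection and the input-length computation can be sketched as follows. This is a minimal illustration, not the procedure of the cited papers: the function names are hypothetical, and the greedy marginal allocation is an assumed method that minimizes the divergence of the type from the target because the objective is separable and convex in the counts. The target is assumed to have strictly positive probabilities.

```python
from math import comb, log

def quantize_composition(target, n):
    """Pick integer counts summing to n whose type minimizes D(type || target).

    Assumption: greedy marginal allocation. Each of the n steps gives one more
    occurrence to the symbol whose cost c*log(c/p) grows the least.
    """
    def cost(c, p):
        return 0.0 if c == 0 else c * log(c / p)

    counts = [0] * len(target)
    for _ in range(n):
        j = min(
            range(len(target)),
            key=lambda i: cost(counts[i] + 1, target[i]) - cost(counts[i], target[i]),
        )
        counts[j] += 1
    return counts

def type_class_size(counts):
    """Number of sequences with the given composition (multinomial coefficient)."""
    size, remaining = 1, sum(counts)
    for c in counts:
        size *= comb(remaining, c)
        remaining -= c
    return size

def max_input_bits(counts):
    """Largest k such that 2**k does not exceed the type-class size."""
    return type_class_size(counts).bit_length() - 1

counts = quantize_composition([0.4, 0.3, 0.2, 0.1], 10)
print(counts, type_class_size(counts), max_input_bits(counts))
```

Using `bit_length()` on the exact integer avoids floating-point error in the logarithm for large block lengths.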
2. Arithmetic Coding with Finite Precision
In practical settings, the interval arithmetic operations central to CCDM are performed on integers of bounded word-length rather than real numbers. The FPA-CCDM algorithm extends Ramabadran’s binary finite-precision AC scheme to the general m-ary case (Pikus et al., 2019). Key features include:
- Model rounding: At each symbol emission step, the cumulative and branching probabilities of the candidate symbols are quantized to integer counts using a scale parameter that depends on the symbol position. Cumulative probabilities are accumulated over the symbols in lexicographic order.
- Interval representation: Each subinterval is stored as three bounded integers, whose range is set by the number of precision ("mantissa") bits.
- Renormalization and output: After all symbols have been processed, the output codeword is read out as a binary integer.
Decoding is the exact reverse process, deterministically extracting the input bit sequence from the interval evolution, ensuring invertibility as long as interval partitioning and renormalization invariants are satisfied (Pikus et al., 2019, Schulte et al., 2015).
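When testing a finite-precision implementation, an exact reference mapping is useful for checking invertibility. The sketch below is not the FPA-CCDM algorithm: it swaps in plain lexicographic indexing with Python's arbitrary-precision integers, so there is no rounding or renormalization, and the specific bit-to-sequence assignment may differ from that of an arithmetic-coding matcher. The names `encode` and `decode` are hypothetical.

```python
from math import comb

def type_class_size(counts):
    """Number of sequences with the given composition (multinomial coefficient)."""
    size, remaining = 1, sum(counts)
    for c in counts:
        size *= comb(remaining, c)
        remaining -= c
    return size

def encode(bits, counts):
    """Map a bit list to the constant-composition sequence with that lexicographic index."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    counts = list(counts)
    if index >= type_class_size(counts):
        raise ValueError("too many input bits for this composition")
    out = []
    for _ in range(sum(counts)):
        for a, c in enumerate(counts):
            if c == 0:
                continue
            counts[a] -= 1
            block = type_class_size(counts)  # sequences that continue with symbol a
            if index < block:
                out.append(a)
                break
            index -= block                   # skip the whole block and try the next symbol
            counts[a] += 1
    return out

def decode(seq, counts, k):
    """Recover the k input bits by recomputing the lexicographic index of seq."""
    counts = list(counts)
    index = 0
    for s in seq:
        for a in range(s):                   # blocks of all smaller symbols come first
            if counts[a] == 0:
                continue
            counts[a] -= 1
            index += type_class_size(counts)
            counts[a] += 1
        counts[s] -= 1
    return [(index >> (k - 1 - i)) & 1 for i in range(k)]

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0]   # 13 bits fit the composition below
codeword = encode(bits, [4, 3, 2, 1])
assert decode(codeword, [4, 3, 2, 1], len(bits)) == bits
```

Every output has exactly the requested composition, and distinct inputs give distinct outputs, which are the two properties a finite-precision matcher must preserve.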
3. Rate-Loss Analysis Under Finite Precision
A key analytical result for FPA-CCDM is that finite-precision effects (the rounding of model statistics and interval endpoints) produce a provable rate loss, but one that diminishes exponentially in the number of precision bits, with a constant factor that depends on the composition and alphabet (Pikus et al., 2019). This is established via a "peeling-off" argument (Eq. (11) of Pikus et al., 2019) showing that the rounding at every step dilates the ideal interval by at most a bounded factor. The total effect is a worst-case rate loss that is summable over all symbols and can be expressed exactly for the worst-case codeword, the codeword in which all identical symbols are grouped together, in terms of the type-class probability (Pikus et al., 2019).
Numerical results indicate that practical precision settings suffice for the block lengths and alphabet sizes considered, yielding negligible rate loss (Pikus et al., 2019).
4. Implementation Complexity and Precision vs. Resource Trade-Offs
FPA-CCDM trades off arithmetic word-length, implementation complexity, and achievable rate. Per-symbol costs consist of a small, bounded number of integer multiplications and shifts, plus one integer division. The interval state and model statistics are held in fixed-width integer registers (Pikus et al., 2019, Schulte et al., 2015).
Table: Resource scaling for FPA-CCDM across representative block lengths and precision
| Block length | Precision bits required | Typical hardware word size |
|---|---|---|
| 64–256 (5G) | 8–12 | 16 bits |
| 1,000–5,000 (Optical) | 12–16 | 24–32 bits |
| 10,000–1,000,000 | | 32–64 bits |
For short blocks, even low precision is sufficient for negligible rate loss, with low hardware overhead. For longer blocks, the precision is increased with the block length so that a target rate gap is maintained (Pikus et al., 2019, Schulte et al., 2015).
5. Extensions: Log-CCDM and Multiplication-Free Approaches
Recent work has addressed the computational overhead from high-precision multiplications and divisions inherent in FPA-CCDM. The "Log-CCDM" construction implements distribution matching based on lookup tables and purely additive log-domain arithmetic, replacing every multiplication/division by LUT indexing and addition/subtraction (Gültekin et al., 2022).
Log-CCDM employs three LUTs: the first stores exponentially spaced intervals, while the other two realize approximate log-domain multiplication and division. The required arithmetic precision grows only logarithmically with the input length, not linearly, and storage requirements are reduced to a few kilobytes (about 4 kB in the reported configuration), while the rate loss stays below 0.01 bit/symbol (Gültekin et al., 2022).
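The general idea of replacing a multiplication by table lookups and an addition can be illustrated with a small fixed-point sketch. It is not the three-LUT construction of Log-CCDM: the two tables, the number of fractional bits `F`, the operand range `MAX_OPERAND`, and the function name `approx_mul` are all assumptions chosen for illustration.

```python
from math import log2

F = 10             # fractional bits of the fixed-point logarithm (assumed)
MAX_OPERAND = 255  # largest operand covered by the log table (assumed)

# LOG_LUT[x] approximates log2(x) with F fractional bits; index 0 is unused padding.
LOG_LUT = [0] + [round(log2(x) * (1 << F)) for x in range(1, MAX_OPERAND + 1)]

# EXP_LUT[f] approximates 2**(f / 2**F), also with F fractional bits.
EXP_LUT = [round(2 ** (f / (1 << F)) * (1 << F)) for f in range(1 << F)]

def approx_mul(a, b):
    """Approximate a * b using two table lookups, one addition, and shifts."""
    s = LOG_LUT[a] + LOG_LUT[b]                  # log2(a) + log2(b) in fixed point
    exponent, frac = s >> F, s & ((1 << F) - 1)  # split into integer and fractional part
    return (EXP_LUT[frac] << exponent) >> F      # 2**exponent * 2**(frac / 2**F)

for a, b in [(3, 5), (17, 200), (255, 255)]:
    print(a, b, a * b, approx_mul(a, b))
```

A division can be approximated the same way by subtracting the two logarithms. The result is only approximate, so a matcher built on such operations needs a separate argument that the interval partition stays consistent between encoder and decoder.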
6. Numerical Performance, Rate Recovery, and Invertibility
Empirical results across both standard and log-domain FPA-CCDM show that, for moderate to large block lengths, rate loss and normalized divergence both decay rapidly as the block length or the precision increases. In the reported configurations, the observed rate closely tracks the full-precision limit given by the corresponding Shannon entropy (Pikus et al., 2019, Gültekin et al., 2022).
Invertibility is guaranteed because rounding is always applied to interval endpoints, not to widths, so the partitioning property of the intervals is preserved at every step in both encoding and decoding. This condition holds provided the mantissa remains above its required minimum and interval underflow does not occur (Pikus et al., 2019, Schulte et al., 2015).
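The difference between rounding endpoints and rounding widths can be seen in a small integer example. This is a generic sketch of one interval-splitting step, not the exact update rule of the cited papers; the function names and the half-open interval convention are assumptions.

```python
def split_by_endpoints(low, width, counts):
    """Split [low, low + width) among symbols in proportion to counts.

    Each boundary is rounded down once and shared by its two neighbours,
    so the subintervals tile the parent interval exactly.
    """
    total = sum(counts)
    bounds, cum = [], 0
    for c in counts:
        lo = low + (width * cum) // total
        cum += c
        hi = low + (width * cum) // total
        bounds.append((lo, hi))
    return bounds

def split_by_widths(low, width, counts):
    """Contrast case: round each width down separately, which can leave a gap."""
    total = sum(counts)
    bounds, lo = [], low
    for c in counts:
        w = (width * c) // total
        bounds.append((lo, lo + w))
        lo += w
    return bounds

print(split_by_endpoints(0, 10, [1, 1, 1]))  # last subinterval ends at 10
print(split_by_widths(0, 10, [1, 1, 1]))     # last subinterval ends at 9, point 9 is unassigned
```

With endpoint rounding, every point of the parent interval belongs to exactly one subinterval, which is what lets the decoder retrace the encoder's choices. A subinterval can still collapse to zero width when `width` is smaller than `sum(counts)`, which is the underflow situation that renormalization has to prevent.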
7. Practical Deployment and System Considerations
FPA-CCDM's modularity and rigorously bounded rate loss have made it a building block for communication system components that require precise distribution control, including high-throughput probabilistic shaping engines for both block and streaming applications. FPGA and ASIC implementations rely on moderate arithmetic precision and bounded state storage to enable scalable, hardware-efficient deployment (Pikus et al., 2019, Gültekin et al., 2022).
In summary, FPA-CCDM enables fully invertible, near-optimal fixed-to-fixed length shaping via arithmetic coding under finite-precision constraints, with mathematically bounded rate loss and manageable computational and memory requirements over a broad regime of signal shaping scenarios.