Bit-Interleaved Data Packing

Updated 4 October 2025

Bit-interleaved data packing is a set of strategies that distribute coded bits across multiple domains to improve error resilience, hardware utilization, and parallel processing.
Its algorithmic implementations, including optimized interleaver designs and packing factors, enable high-throughput performance in FPGA accelerators and cryptographic systems.
Optimized mapping and constellation shaping techniques enhance decoding thresholds and spectral efficiency in modern coded modulation and iterative decoding systems.

Bit-interleaved data packing refers to a family of strategies wherein coded data bits are distributed, rearranged, or embedded across multiple domains—such as symbols, channels, coefficients, or memory banks—in a manner that substantially improves error resilience, parallelism, spectral efficiency, or hardware utilization. Driven by diverse requirements in communication theory, hardware acceleration, encryption, and compression, the concept underpins high-performance system design for MIMO systems, LDPC codes, FPGA datapaths, homomorphic encryption, and neural codecs.

1. Interleaver Design Criteria for Error Resilient Communications

In coded modulation systems, the interleaver determines how coded bits are distributed across the physical resources (symbols, subcarriers, spatial streams). For bit-interleaved coded multiple beamforming (BICMB) systems (0807.2464), two critical criteria are enforced for all error events:

Symbol Separation of Consecutive Bits: Each coded bit in a sequence must be mapped to a distinct symbol. Mathematically, consecutive bits $\{b_1, b_2, \ldots\}$ are mapped such that no two bits in the same error event occupy the same symbol, reducing error burst sensitivity and increasing diversity.
Stream Participation: For $S$ spatial streams, each stream must contribute at least once to any error event:

$\forall \text{ error paths}, \quad a_s \geq 1, \, s \in \{1,\ldots,S\},$

where $a_s$ is the number of times stream $s$ is used with channel bit $1$.

For OFDM systems, consecutive bits must also be transmitted over distinct subcarriers, maximizing both spatial and frequency diversity. These design rules guarantee that no error pattern can bypass the diversity of the entire system, and are sufficient for full diversity in BICMB systems when the free distance $d_{\text{free}}$ is replaced by the Hamming distance $d_h$ in performance analyses. The implementation of these interleaver criteria—especially in multi-stream systems—can become complex, requiring careful combinatorial design to maintain symbol and stream separation for all paths.
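The two criteria can be illustrated with a simplified check. The sketch below is not the construction from the cited work: it approximates an "error event" by a sliding window of consecutive coded bits (a real analysis would enumerate trellis error paths), and the function names and the `(slot, stream)` symbol labelling are assumptions made for illustration.

```python
def round_robin_assignment(num_slots, num_streams, bits_per_symbol):
    """Hypothetical interleaver: rotate consecutive bits across streams first."""
    total = num_slots * num_streams * bits_per_symbol
    out = []
    for k in range(total):
        stream = k % num_streams
        slot = (k // num_streams) % num_slots
        out.append(((slot, stream), stream))  # (symbol id, stream index)
    return out


def sequential_assignment(num_slots, num_streams, bits_per_symbol):
    """No interleaving: fill each symbol with consecutive bits."""
    total = num_slots * num_streams * bits_per_symbol
    out = []
    for k in range(total):
        sym = k // bits_per_symbol
        stream = sym % num_streams
        slot = sym // num_streams
        out.append(((slot, stream), stream))
    return out


def satisfies_criteria(assignment, num_streams, window):
    """Check both criteria on every window of `window` consecutive bits."""
    for start in range(len(assignment) - window + 1):
        chunk = assignment[start:start + window]
        symbols = {sym for sym, _ in chunk}
        streams = {st for _, st in chunk}
        # Criterion 1: all bits in the window sit in distinct symbols.
        # Criterion 2: every stream appears at least once.
        if len(symbols) < window or len(streams) < num_streams:
            return False
    return True


good = round_robin_assignment(num_slots=8, num_streams=2, bits_per_symbol=2)
bad = sequential_assignment(num_slots=8, num_streams=2, bits_per_symbol=2)
```

With a window of four bits, the round-robin assignment passes both checks, while the sequential one fails because adjacent bits share a symbol.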

2. Algorithmic Bit-Interleaved Packing in Cryptography and Accelerators

Bit-interleaved packing is foundational for high-throughput computation and communication in constrained environments. In privacy-preserving federated learning, FedBit (Meng et al., 27 Sep 2025) employs bit-interleaved packing to embed multiple quantized weights into each polynomial coefficient of a BFV ciphertext:

$c_{i,j}^{(\ell)} = \sum_{k=0}^{m_\ell-1} w_{(iN+j)m_\ell+k}^{(\ell)} \cdot 2^{k(\beta_\ell+\delta_\ell)}$

Here, $m_\ell$ is the packing factor for layer $\ell$, each weight is quantized to $\beta_\ell$ bits with a carry-protection margin of $\delta_\ell$ bits, and $N$ is the coefficient count. This packing reduces ciphertext expansion and enables efficient SIMD-style polynomial operations on dedicated FPGA accelerators, with reported speedups of up to $894\times$ for encryption and $60.7\%$ lower communication overhead.
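The packing formula can be sketched on plain integers, leaving the BFV encryption itself aside. The example below assumes unsigned quantized weights; the function names are hypothetical, and `beta` and `delta` stand in for $\beta_\ell$ and $\delta_\ell$. It shows why the margin matters: adding two packed coefficients adds the weights slot by slot, as long as each sum still fits in $\beta_\ell + \delta_\ell$ bits.

```python
def pack_weights(weights, beta, delta):
    """Pack unsigned beta-bit weights into one integer, one slot per weight."""
    slot = beta + delta  # slot width includes the carry-protection margin
    coeff = 0
    for k, w in enumerate(weights):
        if not 0 <= w < (1 << beta):
            raise ValueError("weight does not fit in beta bits")
        coeff += w << (k * slot)
    return coeff


def unpack_weights(coeff, m, beta, delta):
    """Recover m slot values (possibly aggregated sums) from a coefficient."""
    slot = beta + delta
    mask = (1 << slot) - 1
    return [(coeff >> (k * slot)) & mask for k in range(m)]


# Two clients, three 4-bit weights each, 2 margin bits per slot.
client_a = pack_weights([3, 15, 7], beta=4, delta=2)
client_b = pack_weights([12, 15, 1], beta=4, delta=2)
aggregated = unpack_weights(client_a + client_b, m=3, beta=4, delta=2)
# aggregated == [15, 30, 8]
```

With `delta=2`, up to four 4-bit weights can be summed per slot before a carry spills into the neighbouring slot.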

In FPGA DSP blocks (Sommer et al., 2022), bit-interleaved packing allows multiple low-bitwidth multiplications to be performed in a single wide operator. For example, four $4$-bit multiplications can be executed in parallel via controlled bit-offsets, generalizable as:

$r = \mathbf{a} \cdot \mathbf{w}^T = \sum_{i,j} (a_i w_j) \cdot 2^{a_\text{off}[i] + w_\text{off}[j]}$

Advanced methods such as "Overpacking" (packing with negative bit offsets) are introduced, supporting up to six concurrent $4$-bit multiplications per DSP block at minor accuracy cost.
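The offset-based formulation can be sketched with ordinary integers. The example below assumes unsigned 4-bit operands and offsets chosen so that the 8-bit partial products never overlap; it does not model signed arithmetic, DSP port-width limits, or Overpacking, and the function name is hypothetical.

```python
def packed_multiply(a, w, a_off, w_off, width=8):
    """Compute every product a[i] * w[j] with a single wide multiplication."""
    a_packed = sum(ai << off for ai, off in zip(a, a_off))
    w_packed = sum(wj << off for wj, off in zip(w, w_off))
    r = a_packed * w_packed  # the one wide multiply
    mask = (1 << width) - 1
    # Product a[i] * w[j] lands at bit offset a_off[i] + w_off[j].
    return {
        (i, j): (r >> (a_off[i] + w_off[j])) & mask
        for i in range(len(a))
        for j in range(len(w))
    }


# Four 4-bit multiplications: result fields start at bits 0, 8, 16 and 24.
products = packed_multiply(a=[3, 15], w=[7, 15], a_off=[0, 8], w_off=[0, 16])
# products == {(0, 0): 21, (0, 1): 45, (1, 0): 105, (1, 1): 225}
```

Because each product of two 4-bit values is at most 225, an 8-bit spacing between result fields is enough to prevent carries from one field into the next.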

3. Optimized Mapping and Shaping for Channel and Code Performance

Bit-interleaved coded modulation (BICM) is central to modern coded communication. The efficient distribution of bits via optimized bit mappers enhances performance over parallel channels and complex modulations (Häger et al., 2013, Liao et al., 2021). An assignment matrix $A \in \mathbb{R}^{m \times L}$ is constructed such that columns sum to $1$, controlling the allocation of variable nodes across $m$ bit channels ($A_{i,j}$ is the fraction of bits at position $j$ mapped to channel $i$).
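As a small illustration of the assignment matrix, the sketch below builds a uniform baseline mapper and checks the column-sum constraint. The helper names are hypothetical, and the optimization over $A$ described in the cited works is not attempted here.

```python
def uniform_assignment(m, L):
    """Baseline mapper: each position spreads its bits evenly over m channels."""
    return [[1.0 / m] * L for _ in range(m)]


def is_valid_assignment(A, tol=1e-9):
    """Entries must be non-negative and every column must sum to 1."""
    m, L = len(A), len(A[0])
    for j in range(L):
        column = [A[i][j] for i in range(m)]
        if any(x < -tol for x in column) or abs(sum(column) - 1.0) > tol:
            return False
    return True


def bits_per_channel(A, bits_per_position):
    """Bits carried by each channel when every position holds the same count."""
    return [sum(row) * bits_per_position for row in A]


baseline = uniform_assignment(m=2, L=3)
# A non-uniform mapper: position 0 uses only channel 0, position 2 only channel 1.
skewed = [[1.0, 0.5, 0.0],
          [0.0, 0.5, 1.0]]
```

Both matrices load the two channels equally overall, but the skewed one concentrates a channel at specific positions, which is the degree of freedom an optimizer can exploit.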

Optimization methods—including differential evolution—seek to maximize the decoding threshold $\alpha^*(A)$ or minimize decoding iterations for spatially coupled LDPC codes. Results show significant improvements: optimized mappers reduce the required chain length (e.g., $L$ drops from $40$ to $25$ for a fixed gap to capacity); additionally, in tail-biting ensembles, decoding waves are initiated by locally enriched mappings.

Constellation shaping (Valenti et al., 2012) further refines bit packing by biasing shaping bits to select lower-energy subconstellations more frequently. This is formalized as:

$R = R_c \left[ m + g(R_s - 1) \right]$

where $R$ is the overall rate, $R_c$ is the LDPC code rate, $m$ is the number of bits per symbol, $g$ is the number of shaping bits per symbol, and $R_s$ is the shaping code rate. The combination of shaping, iterative decoding, and LDPC optimization yields gains exceeding $1$ dB in AWGN.
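The rate expression is easy to evaluate directly. In the sketch below the parameter values are hypothetical and chosen only to make the arithmetic clean; `m` is taken to be the number of bits per symbol.

```python
from fractions import Fraction


def overall_rate(R_c, m, g, R_s):
    """Evaluate R = R_c * (m + g * (R_s - 1))."""
    return R_c * (m + g * (R_s - 1))


# Hypothetical example: 5 bits per symbol, one shaping bit, rate-1/2 shaping code.
shaped = overall_rate(Fraction(2, 3), m=5, g=1, R_s=Fraction(1, 2))
# Setting R_s = 1 removes the shaping term and leaves R = R_c * m.
unshaped = overall_rate(Fraction(2, 3), m=5, g=1, R_s=Fraction(1, 1))
```

Here `shaped` evaluates to 3 and `unshaped` to 10/3: each shaping bit costs $1 - R_s$ information bits per symbol before channel coding.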

4. Packing Strategies for Data Storage, Compression, and Streaming

Data packing markedly improves memory bandwidth utilization in accelerators (Ferry et al., 22 Jan 2024). Rather than transmitting data aligned to hardware word sizes (e.g., $32$ bits with $17$-bit payloads), the approach packs contiguous custom-width values (so-called Maximal Atomic irRedundant Sets, MARS) into bit-level streams, eliminating padding bits. This is formalized as:

$\text{Packed\_Stream} = \text{BitConcat}(a_0, a_1, \ldots, a_n)$

Runtime compression further reduces traffic by encoding differences between sequential values, e.g., using a length indicator and sign bit for the difference $\Delta = w_i - w_{i-1}$, delivering up to $7\times$ reduction in I/O cycles.
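The padding-free concatenation can be sketched as follows. This is a minimal illustration using Python integers as the bit buffer, with a little-endian layout chosen arbitrarily; the function names are hypothetical, and the delta-based runtime compression is not shown because its exact encoding is not specified here.

```python
def pack_values(values, width):
    """Concatenate fixed-width unsigned values into a byte string, no padding."""
    acc = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << width):
            raise ValueError("value does not fit in the given width")
        acc |= v << (i * width)
    total_bits = len(values) * width
    return acc.to_bytes((total_bits + 7) // 8, "little")


def unpack_values(data, width, count):
    """Recover `count` fixed-width values from a packed byte string."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


values = [1, 131071, 70000, 0, 12345, 99999, 5, 65536]  # all fit in 17 bits
packed = pack_values(values, width=17)
restored = unpack_values(packed, width=17, count=len(values))
```

Eight 17-bit values occupy 17 bytes when packed, compared with 32 bytes if each were padded to a 32-bit word.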

In video codecs (Said et al., 2023), overhead from multiple parallel streams is reduced by bidirectional bitstream packing—concatenating one forward and one reverse stream to share entry points and termination bytes:

Bitstream overhead: $W(D, N_s; \alpha, \beta) \approx (N_s[\alpha \log_2(D/N_s) + \beta])/D$
Bidirectional packing reduces $N_s$, leading to less than $1\%$ overhead for $95$-byte streams, and under $0.1\%$ for $1200$-byte streams.
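The overhead model can be evaluated numerically to see the effect of halving the stream count. In the sketch below, $D$ is read as the total data size and $N_s$ as the number of independently terminated streams, which is an interpretation rather than a definition taken from the source; the values of `alpha` and `beta` are hypothetical placeholders, not fitted constants.

```python
import math


def relative_overhead(D, N_s, alpha, beta):
    """Evaluate W = N_s * (alpha * log2(D / N_s) + beta) / D."""
    return N_s * (alpha * math.log2(D / N_s) + beta) / D


D = 100_000  # hypothetical total size
separate = relative_overhead(D, N_s=8, alpha=1.0, beta=16.0)
# Pairing forward and reverse streams halves the number of terminated units.
paired = relative_overhead(D, N_s=4, alpha=1.0, beta=16.0)
```

For these placeholder constants the paired configuration has roughly half the relative overhead, since the per-stream cost dominates the logarithmic term.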

5. Physical Layer Bit Interleaving for Robustness under Interference

Bit interleaving at the physical layer disperses coded bits across temporal and frequency domains to resist burst and impulse errors (Zhan et al., 2022). Various interleaver designs—packet block, symbol block, 3GPP QPP, S-random—spread errors, increasing the likelihood that error correction codes can recover from burst interference.

For short-packet WirelessHP transmissions, interleavers such as the 3GPP QPP ($\pi(i) = (f_1 i + f_2 i^2) \bmod K$) and S-random (spacing adjacent bits by at least $S$ positions) outperform simpler block schemes, especially under severe impulse interference, whose relative power is defined by

$T = 10 \log_{10}(P_{\text{impulse}} / P_{\text{packet}})$

Packet error rates remain low up to $T \approx 17.5$ dB, but escalate beyond that point, highlighting the limitations of interleaving under extreme conditions.
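A quadratic permutation polynomial interleaver of the form $(f_1 i + f_2 i^2) \bmod K$ takes only a few lines. The sketch below uses $K = 40$, $f_1 = 3$, $f_2 = 10$, which is understood to be the smallest block size in the LTE turbo-code interleaver table but should be treated as an assumption here; the function names are hypothetical.

```python
def qpp_permutation(K, f1, f2):
    """Index map i -> (f1*i + f2*i^2) mod K for i = 0..K-1."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def interleave(bits, perm):
    """Output position i carries input bit perm[i]."""
    return [bits[p] for p in perm]


def deinterleave(bits, perm):
    """Invert `interleave` for the same permutation."""
    out = [None] * len(bits)
    for pos, p in enumerate(perm):
        out[p] = bits[pos]
    return out


perm = qpp_permutation(K=40, f1=3, f2=10)
data = list(range(40))          # stand-in for coded bits, labelled by index
sent = interleave(data, perm)
received = deinterleave(sent, perm)
# A burst hitting the first four transmitted positions touches
# original indices perm[:4] == [0, 13, 6, 19].
```

The burst is spread over non-adjacent positions of the original codeword, which is what gives the decoder a chance to correct it.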

6. Theory-Driven Multilevel and Lattice Packing

Sphere packing and multilevel constellation design exploit bit-interleaved architectures for maximum packing density (Bollauf et al., 2018). Construction C* generalizes classic Construction C by partitioning codewords into $L$ bit-groups and mapping via

$\mu(c_1, c_2, \ldots, c_L) = c_1 + 2c_2 + \cdots + 2^{L-1}c_L$

The resulting constellation $\Gamma_{C^*}$ achieves geometric uniformity for $L=2$ and is “nonlattice” in general, but under sufficient conditions (nested antiprojections, Schur closure) becomes a lattice. Asymptotic packing efficiency ($\rho_{\text{pack}}$) reaches the Minkowski bound ($1/2$) for $C^*$, exceeding traditional constructions.
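The mapping $\mu$ itself is a componentwise weighted sum of the $L$ bit-groups. The sketch below shows only that mapping on integer tuples; it assumes the groups are equal-length binary tuples and does not model the underlying code or the lattice conditions. The function name is hypothetical.

```python
def mu(levels):
    """Map L binary tuples (c_1, ..., c_L) to c_1 + 2*c_2 + ... + 2^(L-1)*c_L."""
    n = len(levels[0])
    if any(len(c) != n for c in levels):
        raise ValueError("all bit-groups must have the same length")
    # Level index i (0-based) carries weight 2**i in every coordinate.
    return tuple(
        sum(c[t] << i for i, c in enumerate(levels))
        for t in range(n)
    )


point = mu([(1, 0, 1), (0, 1, 1)])  # L = 2, three coordinates
# point == (1, 2, 3)
```

Each coordinate of the resulting point is an integer in $\{0, \ldots, 2^L - 1\}$ whose binary digits are the corresponding bits of the $L$ groups.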

7. Bit-Interleaved Packing for Iterative Decoding and Modulation

Bit-interleaved packing directly interfaces with iterative demapping and decoding frameworks for bandwidth-efficient systems (Fang et al., 2021). In bit-interleaved coded modulation (BICM) with protograph-based LDPC codes and spatial coupling, bits are mapped for optimal variable node protection, including schemes such as Variable Degree Matched Mapping (VDMM), Two-Stage Lifting Aided Mapping (TSLM), and Spatial-Position Matched Mapping (SPMM). The iterative process

$\text{[Channel]} \rightarrow \text{[Soft Demapper]} \leftrightarrow \text{[BP Decoder]}$

leverages soft information feedback between the demapper and the decoder, enhancing capacity-approaching performance across diverse channels (AWGN, fading, Poisson, flash-memory).

Conclusion

Bit-interleaved data packing serves as a versatile paradigm unifying principles from modulation theory, code design, data storage, hardware acceleration, cryptographic aggregation, and robust wireless communications. By distributing, embedding, and aligning coded bits across resources—guided by rigorous interleaver and packing criteria—it unlocks substantial gains in diversity, parallelism, error resilience, efficiency, and hardware utilization. As demonstrated across the referenced works, careful design and optimization of bit-interleaved packing strategies are critical for realizing next-generation system performance in both communication-theoretic and computational domains.