Multiterminal Data Compression
- Multiterminal data compression is the efficient encoding of correlated data from distributed terminals, ensuring accurate lossless or lossy reconstruction.
- It employs frameworks like the Slepian–Wolf model, Gaussian rate–distortion analysis, and cascade architectures to optimize rate regions and balance redundancy.
- Fair allocation methods using Shapley values and egalitarian solutions, combined with efficient algorithms such as SPLIT, address computational and scalability challenges.
Multiterminal data compression addresses the efficient encoding of correlated data observed at multiple distributed terminals, subject to reconstruction requirements (lossless or lossy) at a central or function-computing node. The study of multiterminal compression encompasses canonical models such as the Slepian–Wolf network for lossless recovery, generalized multiterminal Gaussian rate–distortion regions, and a variety of topologies including cascades and partial overlaps. Key research directions include rate region characterization, inner and outer bounds for rate–distortion, fairness criteria for rate allocation, and computational aspects of optimal solutions.
1. Foundational Frameworks and Rate Regions
The prototypical multiterminal lossless compression framework is set by the Slepian–Wolf (SW) model, which formalizes the achievable rate region for distributed compression of correlated sources $X_1, \dots, X_n$ with index set $N = \{1, \dots, n\}$. The feasible rate vectors $(R_1, \dots, R_n)$ satisfy the SW conditions:

$$\sum_{i \in S} R_i \ge H(X_S \mid X_{N \setminus S}) \quad \text{for all nonempty } S \subseteq N,$$

where $H(\cdot)$ denotes Shannon entropy. The minimum-sum-rate face of this region, denoted $B(H)$, is a base polyhedron of the submodular entropy function (Ding et al., 2018).
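As a minimal illustration of the SW conditions, the sketch below checks membership of a rate pair in the two-terminal region for a hypothetical doubly symmetric binary source; the source model and all numbers are illustrative, not drawn from the cited works:

```python
import math

# Hypothetical toy source: X1 uniform bit, X2 = X1 XOR noise,
# noise ~ Bernoulli(p), so H(X1|X2) = H(X2|X1) = h2(p).
def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1
H12 = 1.0 + h2(p)   # H(X1, X2) = H(X1) + H(X2 | X1)
H1_given_2 = h2(p)  # H(X1 | X2)
H2_given_1 = h2(p)  # H(X2 | X1)

def in_sw_region(R1, R2):
    """Check the two-terminal Slepian-Wolf conditions."""
    return (R1 >= H1_given_2 and
            R2 >= H2_given_1 and
            R1 + R2 >= H12)

# The corner point (H(X1), H(X2|X1)) is achievable ...
print(in_sw_region(1.0, h2(p)))  # True
# ... but compressing both below their conditional entropies is not.
print(in_sw_region(0.2, 0.2))    # False
```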
Lossy multiterminal compression, specifically in the Gaussian setting, is framed by the rate–distortion region for correlated sources under quadratic distortion constraints. In the symmetric Gaussian case, with $\ell$ equicorrelated components (correlation coefficient $\rho$) and each encoder observing $m$ of them, the rate–distortion function exhibits a reverse water-filling structure, and the performance differs significantly between the centralized ($m = \ell$) and fully distributed ($m = 1$) extremes (Chen et al., 2017). Hybrid models with $1 < m < \ell$ introduce a hierarchy of achievable regions that interpolate between these extremes, depending on the extent of observation overlap and source correlation.
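The centralized benchmark for this source can be computed numerically: reverse water-filling over the spectrum of the equicorrelated covariance matrix. The following sketch assumes unit per-component variance; the function names and the bisection approach are mine, not the paper's:

```python
import math

# Reverse water-filling over the equicorrelated Gaussian spectrum.
def equicorrelated_eigs(ell, rho):
    """Eigenvalues: 1 + (ell-1)*rho once, and 1 - rho with multiplicity ell-1."""
    return [1 + (ell - 1) * rho] + [1 - rho] * (ell - 1)

def centralized_sum_rate(ell, rho, D):
    """Centralized sum rate (bits) to meet total squared-error distortion D."""
    eigs = equicorrelated_eigs(ell, rho)
    lo, hi = 0.0, max(eigs)
    for _ in range(100):  # bisect the water level theta: sum_i min(theta, lam_i) = D
        theta = (lo + hi) / 2
        if sum(min(theta, lam) for lam in eigs) < D:
            lo = theta
        else:
            hi = theta
    theta = (lo + hi) / 2
    # Each spectral component above the water level contributes rate.
    return sum(0.5 * math.log2(lam / min(theta, lam)) for lam in eigs)

print(centralized_sum_rate(4, 0.5, 1.0))
```

For $\rho = 0$ the computation reduces to $\ell$ independent Gaussian sources; increasing $\rho$ concentrates variance in one eigendirection and lowers the required sum rate at moderate distortions.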
Cascade multiterminal coding, relevant in MIMO beamforming network topologies, involves sequential encoding/forwarding of data, with rate–distortion constraints derived from in-network function computation or progressive estimation along the chain (Aguerri et al., 2016).
2. Fair Rate Allocation: Shapley Value and Egalitarian Solutions
Fairness in rate allocation is central in multiterminal networks, especially in wireless sensor networks where per-terminal resource burden must be balanced. Two principal game-theoretic solutions are considered:
- Shapley Value Allocation: Based on the cooperative cost game $(N, H)$ with the entropy function $H$ as characteristic function, the Shapley value allocates to each terminal $i$ the expected marginal contribution,

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \bigl[ H(S \cup \{i\}) - H(S) \bigr].$$

The Shapley value always lies in the core (i.e., in $B(H)$) and satisfies axioms of symmetry and additivity (Ding et al., 2018).
- Egalitarian (Max–Min Fair) Allocation: The weighted egalitarian solution minimizes the maximum normalized rate, equivalently the weighted $\ell_2$-norm over $B(H)$:

$$\min_{\mathbf{r} \in B(H)} \; \sum_{i \in N} \frac{r_i^2}{w_i}.$$

This convex quadratic program admits a unique minimizer due to the strict convexity of the objective and the polyhedral structure of $B(H)$ (Ding et al., 2018).
The Shapley value generally assigns higher rates to sources with more unique information, whereas the egalitarian solution equalizes (normalized) rates as far as the region $B(H)$ allows. Empirical studies show the egalitarian solution is advantageous for network lifetime in energy-constrained distributed systems.
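To make the allocation concrete, the sketch below computes the Shapley value by direct enumeration for a hypothetical three-terminal source with $X_3 = X_1 \oplus X_2$. By the symmetry of this source the Shapley value (and here also the egalitarian solution) assigns each terminal $H(X_1, X_2, X_3)/3 = 2/3$ bit; the source and all names are illustrative:

```python
import math
from itertools import combinations, product

# Hypothetical toy source: (X1, X2) independent uniform bits, X3 = X1 XOR X2.
def joint_pmf():
    return {(x1, x2, x1 ^ x2): 0.25 for x1, x2 in product([0, 1], repeat=2)}

def entropy_of(subset, pmf):
    """H(X_S) in bits, computed by marginalizing the joint pmf."""
    marg = {}
    for x, p in pmf.items():
        key = tuple(x[i] for i in subset)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

def shapley(N, cost):
    """Shapley value via the weighted subset-sum formula."""
    n = len(N)
    fact = math.factorial
    phi = {}
    for i in N:
        others = [j for j in N if j != i]
        phi[i] = sum(
            fact(len(S)) * fact(n - len(S) - 1) / fact(n)
            * (cost(S + (i,)) - cost(S))
            for k in range(n) for S in combinations(others, k))
    return phi

pmf = joint_pmf()
cost = lambda S: entropy_of(tuple(sorted(S)), pmf)
phi = shapley((0, 1, 2), cost)
print(phi)  # each terminal receives 2/3 bit, summing to H(X1,X2,X3) = 2
```

For asymmetric sources the two solutions diverge: the Shapley value tracks unique information, while the egalitarian solution flattens the profile.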
3. Algorithmic and Computational Aspects
Computational complexity is a critical barrier in large-scale multiterminal settings. Computing the Shapley value requires an explicit sum over all $2^{|N|}$ subsets of $N$, which is infeasible for large $|N|$.
Decomposability and Additive Computation: If the entropy function decomposes across a partition $\{N_1, \dots, N_k\}$ of $N$ (i.e., $H(N) = \sum_j H(N_j)$, indicating mutual independence of the blocks), the Shapley value for the global game decomposes into the concatenation of the Shapley values of the subgames on each block (Ding et al., 2018). Direct computation then requires only $\sum_j 2^{|N_j|}$ oracle calls, an exponential saving when the largest block size satisfies $\max_j |N_j| \ll |N|$.
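The decomposition can be checked directly on a toy example. The entropy oracle below is hypothetical (two independent blocks, constructed by hand), and the per-block results are concatenated and compared against the global computation:

```python
import math
from itertools import combinations

# Hypothetical entropy oracle for 4 terminals forming two mutually
# independent blocks A = {0, 1} and B = {2, 3}:
#   - in A, X1 is a copy of X0 (one uniform bit), so H(S) = 1 if S hits A;
#   - in B, X2 and X3 are independent uniform bits, so H(S) = |S ∩ B|.
def H(S):
    S = set(S)
    return (1.0 if S & {0, 1} else 0.0) + len(S & {2, 3})

def shapley(N, cost):
    """Shapley value via the subset-sum formula (2^|N| oracle calls)."""
    n = len(N)
    fact = math.factorial
    phi = {}
    for i in N:
        others = [j for j in N if j != i]
        phi[i] = sum(
            fact(k) * fact(n - k - 1) / fact(n) * (cost(S + (i,)) - cost(S))
            for k in range(n) for S in combinations(others, k))
    return phi

# Global computation (2^4 oracle evaluations) ...
global_phi = shapley((0, 1, 2, 3), H)
# ... versus per-block computation (2^2 + 2^2 evaluations), concatenated.
block_phi = {**shapley((0, 1), H), **shapley((2, 3), H)}

match = all(abs(global_phi[i] - block_phi[i]) < 1e-9 for i in range(4))
print(match)  # the concatenated block solution equals the global one
```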
The SPLIT Algorithm for Egalitarian Solutions: The SPLIT algorithm efficiently computes the egalitarian solution by recursively splitting the terminal set via submodular-function minimizations (SFM). Each split isolates a 'tight' subset and recurses on the remainder, exploiting the submodularity of the entropy function. The algorithm is strongly polynomial, requiring a number of SFM calls polynomial in $|N|$, and is parallelizable since independent subproblems can be handled concurrently (Ding et al., 2018).
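The recursive split-and-contract idea can be sketched for uniform weights, where the egalitarian solution is the lexicographically optimal base: repeatedly find the subset with minimum average conditional entropy, fix those rates, and contract. This is a minimal sketch, not the paper's algorithm: a brute-force subset search stands in for the strongly polynomial SFM oracle, and the entropy values are a hypothetical toy oracle chosen to be submodular:

```python
from itertools import combinations

# Toy entropy oracle (hypothetical, submodular): X0 and X1 are
# correlated one-bit sources, X2 is an independent half-bit source.
HVALS = {(): 0.0, (0,): 1.0, (1,): 1.0, (2,): 0.5,
         (0, 1): 1.5, (0, 2): 1.5, (1, 2): 1.5, (0, 1, 2): 2.0}

def H(S):
    return HVALS[tuple(sorted(S))]

def egalitarian(N, H):
    """Uniform-weight egalitarian solution on the base polyhedron B(H)."""
    rates, fixed = {}, ()
    remaining = list(N)
    while remaining:
        best, best_val = None, float("inf")
        for k in range(1, len(remaining) + 1):
            for S in combinations(remaining, k):
                # Average conditional entropy of S given already-fixed terminals.
                val = (H(fixed + S) - H(fixed)) / len(S)
                if val <= best_val:   # '<=' keeps the larger tied subset
                    best, best_val = S, val
        for i in best:                # fix equal rates on the tight subset
            rates[i] = best_val
        fixed += best                 # contract and recurse on the remainder
        remaining = [i for i in remaining if i not in best]
    return rates

r = egalitarian((0, 1, 2), H)
print(r)  # X2's low entropy earns it the smallest rate; X0, X1 split the rest
```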
A table summarizing computational complexity for rate allocation methods:
| Method | Complexity | Parallelization |
|---|---|---|
| Shapley Value | $O(2^{|N|})$ entropy-oracle calls; $\sum_j 2^{|N_j|}$ under decomposability | Partial, via decomposability |
| SPLIT (Egalitarian) | Strongly polynomial (repeated SFM) | Fully parallelizable |
4. Multiterminal Compression in Symmetric Gaussian Networks
In the symmetric Gaussian multiterminal problem, encoding performance strongly depends on the per-encoder observation size $m$, the global correlation coefficient $\rho$, and the target distortion $d$. The following sharp results are established (Chen et al., 2017):
- For $m \ge 2$ and any admissible $\rho$, the sum-rate matches the centralized rate–distortion limit for all $d$.
- For $m = 1$, there exists an explicit distortion threshold $d_c$ such that for $d \le d_c$, the sum-rate still coincides with the centralized case.
- For larger $d$ in the $m = 1$ case, the distributed system incurs a finite gap relative to the centralized limit, vanishing as $d$ approaches its maximum or in the large-$\ell$ limit.
This establishes that even limited cooperation or overlap in observations at the encoders ($m \ge 2$) significantly reduces the redundancy incurred by distributed encoding, especially in large-scale settings. A plausible implication is that system design in sensor networks can leverage partial-observation schemes to attain near-centralized efficiency.
5. Cascade and In-network Compression: Lossy Function Computation
Cascade multiterminal scenarios, such as uplink beamforming in distributed MIMO, instantiate source coding for function computation through sequential or in-network compression (Aguerri et al., 2016):
- Innovation-only Compression (Improved Routing, IR): Each terminal compresses only the innovation with respect to previous descriptions, exploiting side information at subsequent nodes. The achievable rate constraints take the form:

$$R_k \ge I(S_k; U_k \mid U_1, \dots, U_{k-1}),$$

where $S_k$ denotes the observation and $U_k$ the description produced at node $k$, yielding strictly better performance than naive routing.
- In-Network Processing (IP): Progressive estimation and function computation are performed at each node, with messages encoding sufficient statistics for later nodes, matched to the Wyner–Ziv scenario. The rate–distortion tradeoff is governed by:

$$R_k \ge I(S_k, U_{k-1}; U_k),$$

where $U_k$ now carries the progressively refined estimate forwarded by node $k$ under the end-to-end distortion constraint.
Both schemes admit closed-form solutions in the Gaussian case via reverse water-filling, and the progressive scheme typically approaches the information-theoretic outer bound. Numerical results show that in distributed beamforming, in-network processing outperforms both innovation-only compression and standard routing.
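The gain from innovation-only compression can be illustrated on a two-hop Gaussian chain using the standard quadratic Gaussian rate–distortion function. This is a hedged numeric sketch under my own simplifying assumption of ideal, distortion-free side information at the downstream node, not the paper's exact achievable region:

```python
import math

# Unit-variance jointly Gaussian X1, X2 with correlation rho. Once X1 is
# known downstream, only the innovation X2 - rho*X1 (variance 1 - rho^2)
# must be encoded.
def gaussian_rate(var, D):
    """Quadratic Gaussian rate-distortion R(D) = max(0, 0.5*log2(var/D))."""
    return max(0.0, 0.5 * math.log2(var / D))

rho, D = 0.9, 0.05
naive = gaussian_rate(1.0, D)                  # forward X2 from scratch
innovation = gaussian_rate(1.0 - rho ** 2, D)  # exploit the side information
print(f"naive: {naive:.3f} bits, innovation: {innovation:.3f} bits")
```

The rate saving, $\tfrac{1}{2}\log_2\!\frac{1}{1-\rho^2}$, grows without bound as $\rho \to 1$, which is why exploiting descriptions already traveling along the chain dominates naive routing at high correlation.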
6. Applications, Fairness, and Practical Implications
Multiterminal compression underpins designs in wireless sensor networks, fronthaul-constrained cloud radio access networks, and distributed sensing. Fairness-oriented allocations are crucial in these distributed systems to balance energy expenditure, maximize network lifetime, and ensure robust operation.
The SPLIT algorithm's parallelism and strongly polynomial guarantee enable scalable deployment, while the decomposable Shapley value approach offers fair allocations when independence structure permits dimension reduction. Comparisons indicate that in practice, the egalitarian solution achieves superior network balance and lower per-terminal burden, especially as network scale increases (Ding et al., 2018).
In cascade architectures, hybrid or progressive in-network strategies yield significant performance benefits for function computation under rate constraints, suggesting design motifs for next-generation distributed systems (Aguerri et al., 2016).
References:
- Ding et al., 2018
- Chen et al., 2017
- Aguerri et al., 2016