Pliable Index Coding Instances

Updated 13 December 2025

Pliable index coding is an information-theoretic broadcast problem in which each client is satisfied by decoding any message it does not already know, and the goal is to minimize the number of transmissions.
The framework employs combinatorial designs and hypergraph representations to derive both upper and lower bounds on the optimal code length.
Advanced approaches, including a greedy bucketing algorithm, enable nearly optimal performance and adapt to decentralized, secure, and multi-request settings.

A pliable-index-coding instance is an information-theoretic broadcast problem in which a server (or a decentralized set of users) holds a library of messages and a collection of clients, each knowing a subset of the messages as side information, are satisfied by recovering any one (or, more generally, any $t$ ) message(s) missing from their side information. In contrast to traditional index coding, which requires delivering a specific demand for each client, the decoding objective in pliable index coding (PICOD) is satisfied if each client obtains any unknown message(s), permitting the assignment of demands to clients to be chosen by the system for optimal performance. The minimum code length achieving universal satisfaction of clients with this maximal flexibility constitutes the central combinatorial and algorithmic object of study.

1. Formal Structure and Problem Classes

A general pliable-index-coding instance is specified by a tuple $(\mathcal{M}, \mathcal{C}, \{S_i\})$ :

$\mathcal{M} = \{x_1, \ldots, x_m\}$ is the set of $m$ independent messages over a finite field $\mathbb{F}_q$ .
$\mathcal{C}$ is a set of $n$ clients, with client $i$ holding side-information $S_i \subsetneq \mathcal{M}$ .
Each client $i$ is satisfied if it can recover any $t$ messages in $\mathcal{M} \setminus S_i$ .
The goal is to minimize the number of transmissions (using scalar-linear, vector-linear, or, more generally, non-linear codes) so that all clients are satisfied.
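The definition above can be made concrete for binary scalar-linear codes. The following is a minimal sketch, assuming messages are indexed `0..m-1`, each transmission is the XOR of a set of message indices, and decoding is linear over GF(2). The helper names (`rref_gf2`, `decodable_messages`, `all_satisfied`) are illustrative and not taken from the cited papers.

```python
def rref_gf2(rows):
    """Fully reduce bitmask rows over GF(2); return the nonzero basis rows."""
    basis = []  # (pivot_bit, row) pairs; no other row has a 1 at pivot_bit
    for r in rows:
        for p, b in basis:
            if (r >> p) & 1:
                r ^= b
        if r:
            p = r.bit_length() - 1
            # Clear the new pivot bit from the rows already in the basis.
            basis = [(q, b ^ r if (b >> p) & 1 else b) for q, b in basis]
            basis.append((p, r))
    return [b for _, b in basis]


def decodable_messages(m, side_info, transmissions):
    """Messages a client can recover linearly from XOR transmissions."""
    unknown_mask = sum(1 << j for j in range(m) if j not in side_info)
    # Known messages are subtracted, so only unknown coordinates remain.
    rows = [sum(1 << j for j in tx) & unknown_mask for tx in transmissions]
    reduced = rref_gf2(rows)
    # A unit vector lies in the row space iff it appears as a reduced row.
    return {j for j in range(m) if (1 << j) in reduced}


def all_satisfied(m, side_infos, transmissions, t=1):
    """True if every client recovers at least t messages it did not know."""
    return all(len(decodable_messages(m, s, transmissions)) >= t
               for s in side_infos)


# Hypothetical instance: 3 messages, 4 clients.
clients = [{0}, {1}, {2}, {0, 1}]
code = [{0, 1}, {1, 2}]  # x0 + x1 and x1 + x2
print(all_satisfied(3, clients, code))
```

In this toy instance the single transmission `x0 + x1` leaves the client holding `x2` unsatisfied, while adding `x1 + x2` satisfies everyone.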

The problem generalizes to groupings of messages, side-information with additional constraints (such as consecutive or structured sets), and multi-request settings where clients must recover more than one new message. Notable special classes include:

Complete- $S$ : Every possible side-information set of size $s \in S$ is present as a client.
Group-complete: Messages are grouped, and side information consists of groups.
Decentralized/Distributed: Transmissions can be generated locally according to each user's knowledge only (Liu et al., 2019, Kadakkottiri et al., 1 Jul 2025).
Secure: Each client is allowed to decode only one unknown message and should not obtain information about any other (Liu et al., 2020, Liu et al., 2020).

2. Algorithmic and Structural Results

The core theoretical findings for pliable-index-coding instances are as follows:

Computing the minimum code length is NP-hard, via a reduction from the Minimum Hitting Set problem. Any code of length $K$ corresponds to a hitting set of size $K$ , demonstrating computational hardness (Song et al., 2016).

Nevertheless, a deterministic, greedy two-level bucketing algorithm (PLICODE) provides an $\mathcal{O}(\log^2 n)$ upper bound on the code length, which is nearly worst-case optimal. Each step sorts unsatisfied clients into buckets based on the size of their missing set, applies a greedy covering, and iterates; see the pseudocode in (Song et al., 2016):

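function PliCode(S_1,...,S_n)
    U ← {1,...,n}        // unsatisfied clients
    G ← []               // transmissions chosen so far
    while U ≠ ∅
        bucket U by missing-set size |M \ S_i|
        for each nonempty bucket B_ℓ
            choose O(log n) transmissions covering at least half of B_ℓ
            append these transmissions to G
            remove satisfied clients from U
        end
    end
    return G             // code length is |G|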
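The pseudocode leaves the covering step abstract. The sketch below is a simplified stand-in rather than the PLICODE algorithm of Song et al.: it buckets unsatisfied clients by the power of two of their missing-set size and builds one greedy XOR transmission per bucket, but it does not implement the covering step that yields the $\mathcal{O}(\log^2 n)$ guarantee. It assumes $t = 1$ and binary XOR transmissions; the function names are illustrative.

```python
def greedy_tx(m, missing, targets):
    """Grow one XOR set greedily. A target is served when exactly one of
    its missing messages is in the set, so it can cancel the rest."""
    def served(tx):
        return {i for i in targets if len(missing[i] & tx) == 1}

    tx = set()
    for j in range(m):
        if len(served(tx | {j})) > len(served(tx)):
            tx.add(j)
    return tx


def bucketed_greedy(m, side_infos):
    """Return a list of XOR transmissions satisfying every client (t = 1)."""
    missing = [set(range(m)) - set(s) for s in side_infos]
    if any(not miss for miss in missing):
        raise ValueError("every client must miss at least one message")
    unsatisfied = set(range(len(missing)))
    code = []
    while unsatisfied:
        # Bucket by floor(log2 |missing set|).
        buckets = {}
        for i in unsatisfied:
            level = len(missing[i]).bit_length() - 1
            buckets.setdefault(level, set()).add(i)
        for level in sorted(buckets):
            targets = buckets[level] & unsatisfied
            if not targets:
                continue  # bucket already served by an earlier transmission
            tx = greedy_tx(m, missing, targets)
            code.append(tx)
            unsatisfied -= {i for i in unsatisfied
                            if len(missing[i] & tx) == 1}
    return code


# Three clients, each missing a different single message: one XOR suffices.
print(bucketed_greedy(3, [{1, 2}, {0, 2}, {0, 1}]))  # → [{0, 1, 2}]
```

Each greedy transmission serves at least one client in its bucket, so the loop terminates. The number of transmissions it produces carries no approximation guarantee.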

The best-known lower bound is $\Omega(\log n)$ , tight for some adversarially constructed instances (Song et al., 2016). For multi-request ( $t$ -request) cases, the upper bound is $\mathcal{O}(t\log n+\log^2 n)$ and the lower bound is $\Omega(t+\log n)$ (Song et al., 2016).
Pliable index coding admits an exact characterization in terms of minrank over a family of "mixed matrices": for each client, messages they already know correspond to zeros, and allowable message positions yield "free variables." The code length is then the minimum rank achievable by such a completion (Song et al., 2016).
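For very small instances, the optimal length within the class of binary scalar-linear codes can be found by exhaustive search. The sketch below is a brute-force stand-in, not the mixed-matrix minrank formulation of Song et al. It assumes GF(2), linear decoding, and an instance in which every client misses at least $t$ messages; the function names are illustrative.

```python
from itertools import combinations


def span_gf2(rows):
    """All GF(2) linear combinations of the given bitmask rows."""
    span = {0}
    for r in rows:
        span |= {s ^ r for s in span}
    return span


def client_satisfied(m, side_info, code, t):
    """True if the client can linearly recover at least t unknown messages."""
    unknown_mask = sum(1 << j for j in range(m) if j not in side_info)
    # Subtracting known messages leaves only the unknown coordinates.
    span = span_gf2([v & unknown_mask for v in code])
    return sum((1 << j) in span for j in range(m)) >= t


def min_linear_length(m, side_infos, t=1):
    """Smallest number of binary linear transmissions satisfying everyone.

    Exponential in m; intended only for toy instances.
    """
    vectors = range(1, 1 << m)  # every nonzero XOR combination of messages
    for k in range(m + 1):
        for code in combinations(vectors, k):
            if all(client_satisfied(m, s, code, t) for s in side_infos):
                return k, code
    return None  # some client misses fewer than t messages


length, code = min_linear_length(3, [{0}, {1}, {2}, {0, 1}])
print(length)  # → 2
```

Sending all $m$ messages uncoded always works when the instance is feasible, so the search stops at or before $k = m$.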

3. Extremal and Special-Structure Instances

Tabular summary of extremal results for certain classes:

Instance Class	Achievable Code Length	Worst-case Lower Bound
General; $n$ clients	$O(\log^2 n)$	$\Omega(\log n)$
Complete- $S$ , $t=1$	$\min\{s_{\max}+1, m-s_{\min}\}$	tight (Liu et al., 2018)
Group-complete ( $m$ groups of size $g$ , singleton $S$ )	See the paragraph following this table	See the paragraph following this table
Random side information, $n\sim m$	$O(\log^2 n)$ w.h.p.	$\Omega(\log n)$ w.h.p.
Combinatorial circular/consecutive side info	Table in (Sasi et al., 2019)	exact; various parameter regimes
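The complete- $S$ row of the table can be evaluated directly. The sketch below builds the client list of a complete- $S$ instance and computes the table entry $\min\{s_{\max}+1, m-s_{\min}\}$. It assumes $s_{\min}$ and $s_{\max}$ are the smallest and largest sizes in $S$ , that $t=1$ , and that every size in $S$ is below $m$ . The exact conditions under which the formula is tight are those stated in Liu et al. (2018) and are not checked here.

```python
from itertools import combinations


def complete_s_clients(m, sizes):
    """One client for every side-information set whose size is in `sizes`."""
    return [set(c) for s in sorted(sizes)
            for c in combinations(range(m), s)]


def complete_s_length(m, sizes):
    """Table entry min{s_max + 1, m - s_min} for a complete-S instance."""
    if not sizes or max(sizes) >= m:
        raise ValueError("sizes must be nonempty and smaller than m")
    return min(max(sizes) + 1, m - min(sizes))


print(len(complete_s_clients(5, {1, 2})))  # 5 + 10 = 15 clients
print(complete_s_length(5, {1, 2}))        # min(3, 4) = 3
```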

In group-complete- $S$ settings with $m$ groups of size $g$ and singleton $S = \{s\}$ , the optimal code length jumps from $s+t$ to $g(m-s)$ as the request threshold $t$ crosses $(g-1)(m-s)$ (Eghbal et al., 2024).

4. Hypergraph Representations and Structural Bounds

Pliable-index-coding instances are canonically associated with hypergraphs $\mathcal{H} = (V, E)$ :

Vertices: messages
Hyperedges: client request sets (equivalently, side-information sets as their complements)
Maximum degree $\Delta(\mathcal{H})$ gives an achievability bound: code length $\leq \Delta(\mathcal{H})$ via a greedy degree-based algorithm (B. et al., 2022).
The nesting number $\eta(\mathcal{H})$ , measuring the length of the longest possible nested chain of side-information sets, provides a converse: code length $\geq \eta(\mathcal{H})$ (B. et al., 3 Nov 2025, B. et al., 2022).
For low-degree hypergraphs ( $\Delta \in \{1,2,3\}$ ), the optimal code length is determined exactly by degree and nesting (B. et al., 2022).
Conflict-free colorings and $t$ -strong coloring variants on the hypergraph offer further algorithmic schemes, yielding $O(\log^2 \Gamma)$ code lengths for overlap $\Gamma$ and near-optimal vector codes for multi-request (Krishnan et al., 2021).
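Both structural parameters are easy to compute on small instances. The sketch below assumes the reading given in the list above: hyperedges are the request sets $\mathcal{M} \setminus S_i$ , the degree of a message is the number of hyperedges containing it, and the nesting number is the length of the longest chain of strictly nested side-information sets. The cited papers may use slightly different conventions, and the function names are illustrative.

```python
def max_degree(m, side_infos):
    """Largest number of request sets (hyperedges) sharing one message."""
    requests = [set(range(m)) - set(s) for s in side_infos]
    return max(sum(j in r for r in requests) for j in range(m))


def nesting_number(side_infos):
    """Length of the longest chain S_a ⊂ S_b ⊂ ... of side-information sets."""
    sets = sorted({frozenset(s) for s in side_infos}, key=len)
    longest = {}
    for s in sets:
        # Proper subsets are strictly smaller, so they were processed earlier.
        longest[s] = 1 + max((longest[p] for p in sets if p < s), default=0)
    return max(longest.values(), default=0)


clients = [{0}, {1}, {2}, {0, 1}]
print(nesting_number(clients), max_degree(3, clients))  # → 2 3
```

For this toy instance the two bounds bracket the code length between 2 and 3.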

5. Bounds via Absent Receivers, Chains, and Criticality

The broadcast rate lower bounds for general instances make essential use of absent receivers, that is, side-information sets that no client in the instance holds:

By constructing a decoding chain—an ordered sequence of message inclusions indexed by observed absent side-information sets—one derives that the code length is at least $m - L^*$ , where $L^*$ is the minimal number of "skips" needed to complete the chain, maximized over all decoding choices (Ong et al., 2019, Ong et al., 2019, Ong et al., 6 Dec 2025).
The simplification to maximal nested chains of absent side-information sets permits explicit calculation of optimal code length for many instances, particularly those with up to four absent receivers or perfect $L$ -nested or truncated-nested absent set structures (Ong et al., 2019, Ong et al., 2019, Ong et al., 6 Dec 2025).
Certain families are critical: adding any one absent receiver strictly increases the required broadcast length. For example, for perfectly $L$ -nested absent sets on a partition of messages, the code length jumps from $m-L$ to $m-L+1$ with any addition (Ong et al., 2019).
These methods subsume and strengthen previous information-theoretic bounds (MAIS-type) and connect directly to hypergraph-theoretic and coloring formalisms.
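As a starting point for these bounds, the absent receivers of a small instance can be enumerated directly. The sketch below assumes that an absent receiver is any proper subset of the message set, including the empty set, that is not the side-information set of some client. It only lists the absent sets and does not compute the chain-based bound $m - L^*$ .

```python
from itertools import combinations


def absent_side_info_sets(m, side_infos):
    """Proper subsets of {0, ..., m-1} that no client holds as side information."""
    present = {frozenset(s) for s in side_infos}
    return [frozenset(c)
            for k in range(m)  # sizes 0 .. m-1, i.e. proper subsets only
            for c in combinations(range(m), k)
            if frozenset(c) not in present]


absent = absent_side_info_sets(3, [{0}, {1}, {2}, {0, 1}])
print(sorted(sorted(a) for a in absent))  # → [[], [0, 2], [1, 2]]
```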

6. Decentralized, Secure, and Application-Driven Pliable Index Coding

Several variants capture scenarios in which coded symbol generation is distributed:

Decentralized PICOD: Users communicate without a central transmitter. With pliable demands, the optimal centralized and decentralized code lengths coincide except in degenerate cases where the demands are effectively not pliable. Codes exploit sparse MDS designs and vector-linear constructions (Liu et al., 2019).
Secure Decentralized PICOD: Each user decodes only one new message and gains no further information. For circular-shift side information, the decentralized/secure combination can require up to triple the code length of the centralized/secure regime for certain parameters, and infeasibility arises for specific parameter regimes (Liu et al., 2020, Liu et al., 2020).
Federated Learning / Data Shuffling: Coding for decentralized data reshuffling (notably in non-IID federated learning) is naturally modeled as DPIC or CDPIC( $S$ , $K$ ) instances. Explicit code constructions transform local data distributions toward IID in a small, provably bounded number of shuffling rounds, with rigorous bounds and extensive experimental evaluation (Kadakkottiri et al., 1 Jul 2025, Song et al., 2017).

7. Advanced Formulations and Future Directions

Very Pliable Index Coding: Allows decoded indices to depend on message realizations, strictly generalizing standard pliable coding. Linear coding cannot exploit this generalization, but nonlinear very-pliable codes can strictly improve rates at finite blocklength (Ong et al., 2022).
Open Problems: Exact code length for arbitrary instances, tight converse in group-complete for $t \leq (g-1)(m-s)$ , role of nonlinear and probabilistic codes, and design for networked, nonuniform, or distributed architectures remain key challenges (Eghbal et al., 2024).
Connections: The area draws on methods and tools from combinatorial design, hypergraph theory, coloring, minrank optimization, and network coding, which interact in a still-growing research landscape.