
Coded Caching: Principles & Advances

Updated 6 December 2025
  • Coded caching is a communication-theoretic approach that strategically partitions files and employs coded multicasting to reduce transmission rates.
  • It optimizes the trade-off between cache memory size and delivery rate by using subfile partitioning and multicast gains across centralized and decentralized networks.
  • Recent advancements include combinatorial and learning-based methods to lower subpacketization complexity while addressing privacy and security challenges.

Coded caching is a communication-theoretic paradigm that couples local cache memories with coded multicasting, yielding rate reductions over shared links relative to classical uncoded caching. Unlike traditional schemes, in which users store content fragments for direct or unicast-based retrieval, coded caching strategically partitions files into subfiles (or, with file-level coding, linear combinations), enabling the server to deliver information to multiple users simultaneously via coded transmissions, with each user able to extract its requested file by exploiting its locally cached content as side information. The concept was introduced in the context of content delivery networks by Maddah-Ali and Niesen and is now central to the study of cache-aided networks, including both centralized and decentralized placement, networks with helper architectures, and privacy- and security-enhanced variants.

1. System Model and Canonical Formulation

Coded caching is conventionally studied in a network consisting of a single server storing a library of $N$ independent files $(W_1, \ldots, W_N)$, each of size $F$ bits, and $K$ users, each equipped with a local cache of size $MF$ bits (normalized cache size $M$). The server communicates with all users via an error-free shared broadcast link. Operation takes place in two phases:

  • Placement phase (prefetching): Caches are filled as a function of all files, but without knowledge of users’ future demands.
  • Delivery phase: Each user requests a file; the server transmits a common message $X$ over the broadcast link so that every user can reconstruct its requested file using its cache and the transmitted message.

With subpacketization, files are partitioned into possibly many subfiles, enabling coded multicast opportunities. Each user’s cache content $Z_k$ (or coded combinations thereof) is optimized to induce maximal coding opportunities in the delivery phase, often matching the combinatorial structure of the cache placement.

The key trade-off is the memory-rate curve, relating $M$ to the minimum delivery rate $R$ (the normalized number of file-length transmissions) that ensures, with high probability as $F \to \infty$, that each user retrieves its request.

2. Centralized and Decentralized Coded Caching

Centralized Coded Caching

The classical Maddah-Ali–Niesen (MAN) centralized coded caching scheme assumes the identities of the $K$ users are fixed at placement, and implements a combinatorial uncoded placement in which each file is divided into $\binom{K}{t}$ subfiles, with $t = KM/N$. User $k$ caches every subfile whose index set includes $k$. In the delivery phase, the server multicasts XORs of subfiles indexed by $(t+1)$-element subsets, achieving the rate

$$R(M) = \frac{K(1 - M/N)}{1 + KM/N},$$

with subpacketization $F_{\mathrm{MAN}} = \binom{K}{KM/N}$ (Ravindrakumar et al., 2016).
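
To make the combinatorics concrete, the following minimal Python sketch mirrors the MAN placement and delivery logic on symbolic subfile labels rather than actual bits (the function and variable names are illustrative, not from the literature):

```python
from itertools import combinations

def man_scheme(N, K, t, demands):
    """Sketch of MAN placement/delivery for t = KM/N, on symbolic labels.

    A subfile is a pair (file n, t-subset T); user k stores those with
    k in T. Delivery sends one XOR per (t+1)-subset S of users,
    combining for each k in S the subfile of file d_k indexed by S
    minus {k}; every other user in S already caches that term.
    """
    users = range(K)
    caches = {k: {(n, T) for n in range(N)
                  for T in combinations(users, t) if k in T}
              for k in users}
    transmissions = [[(demands[k], tuple(u for u in S if u != k)) for k in S]
                     for S in combinations(users, t + 1)]
    return caches, transmissions

# K = 4, M/N = 1/2 (t = 2), distinct demands: binom(4,3) = 4 XORs of
# size F/binom(4,2) = F/6 each, i.e. R = 4/6 = 2/3, matching
# K(1 - M/N)/(1 + KM/N).
caches, tx = man_scheme(N=4, K=4, t=2, demands=[0, 1, 2, 3])
print(len(tx))  # 4
```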

Decentralized Coded Caching

Decentralized settings, where the active user set is not known at placement, employ random caching: each user independently caches a fraction $M/N$ of each file. The resulting delivery phase requires more intricate scheme design, but can approach the same multicasting gain asymptotically as $K$ and $N$ grow. Group-based decentralized schemes further exploit repeated file requests and user grouping for improved rate-memory trade-offs, especially when $K > N$ (Amiri et al., 2016).
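
A minimal sketch of the random placement step, assuming each file is pre-split into $F$ equal packets and using packet indices in place of bits (names are illustrative):

```python
import random

def decentralized_placement(N, K, F, M, seed=0):
    """Decentralized (random) placement sketch: each user independently
    caches each of the F packets of every file with probability M/N,
    so its cache holds about M*F packets in expectation."""
    rng = random.Random(seed)
    p = M / N
    return [{n: {i for i in range(F) if rng.random() < p} for n in range(N)}
            for _ in range(K)]

caches = decentralized_placement(N=4, K=3, F=10_000, M=2)
# For large F, the set of packets cached by exactly a given subset of
# users concentrates near its expected fraction; these overlaps are the
# coded multicast opportunities exploited during delivery.
print(len(caches[0][0]) / 10_000)  # close to M/N = 0.5
```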

3. Rate-Subpacketization Trade-offs and Code Constructions

Achieving the theoretical benefits of coded caching typically requires high subpacketization (large $F$). This has motivated several research directions aimed at reducing subpacketization while retaining most of the coding gain:

  • Resolvable and block design-based schemes: By interpreting cache placement as resolvable design construction, where parallel classes and block intersections guarantee unique decodability, practical schemes with polynomial (rather than exponential) subpacketization can be constructed, at the cost of a slight rate increase (Tang et al., 2016, Tang et al., 2017). For uniform cache sizes $M/N = 1/q$, subpacketization can be reduced from $F_{\mathrm{MAN}}$ to $F = q^{k-1}$ for $K = qk$ users.
  • Combinatorial designs and PDAs: Placement Delivery Arrays (PDAs) and combinatorial design-based schemes unify many constructions and yield schemes with subpacketization $F = O(\mathrm{poly}(K))$ and rates $R = O(1)$ or even $O(1/K)$, though typically requiring $M/N$ to vanish as $K$ grows (Agrawal et al., 2019).
| Scheme class | Subpacketization $F$ | Achievable rate $R$ |
|---|---|---|
| MAN (centralized) | $\binom{K}{KM/N}$ | $K(1-M/N)/(1+KM/N)$ |
| Design-based, $M/N = 1/q$ | $q^{k-1}$ | $q-1$ |
| Combinatorial design (BIBD, TD, etc.) | $O(K^i)$, $i = 1, 2, 3$ | $O(1)$ or $O(1/K)$ |
| Shared cache (SC-CC), matrix/design-based | fixed $q^m$ | algorithmic, $R(M)$ computed |

These reductions are critical for making coded caching implementable at moderate-to-large $K$, as MAN subpacketization is infeasible for large systems.
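
A quick numeric illustration of the gap, plugging the parameters from the table above into both subpacketization formulas (the $(q, k)$ values are arbitrary examples):

```python
from math import comb

# MAN subpacketization binom(K, KM/N) vs. the design-based q^(k-1)
# level, for K = q*k users with uniform cache ratio M/N = 1/q.
for q, k in [(2, 8), (4, 8), (8, 8)]:
    K, t = q * k, k  # t = K * (M/N) = k
    print(f"K={K:2d}, M/N=1/{q}: F_MAN={comb(K, t):>13,}  F_design={q**(k-1):>9,}")
```

For $K = 64$ and $M/N = 1/8$, MAN needs over $4 \times 10^9$ subfiles per file while the design-based scheme needs about $2 \times 10^6$, at the cost of the higher rate $q - 1$.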

4. Generalized and Extended Models

Private and Secure Coded Caching

The coded caching model has been extended to support user-privacy constraints, requiring that no user gain information about files it did not request (information-theoretic secrecy), via secret sharing and one-time-pad masking. A canonical construction uses a $\left(\binom{K}{t}, \binom{K-1}{t-1}\right)$ secret-sharing scheme for each file, injects random keys into the cache placement, and masks server transmissions with fresh one-time pads. The resulting trade-off curve

$$R_C(M) = \operatorname{conv}\left\{\left(M,\ R = \frac{K(N+M-1)}{N+(K+1)(M-1)}\right) : M = 1 + \frac{Nt}{K-t}\right\},$$

is order-optimal within a factor of 16 of the secrecy-constrained cut-set bound (Ravindrakumar et al., 2016). For demand privacy and wiretap security, the Secure Placement Delivery Array (SPDA) framework yields order-optimal constructions with rates and subpacketization levels independent of the library size $N$ (Cheng et al., 2020).
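
As a worked illustration, the memory-rate corner points of this curve can be enumerated directly from the quoted formula; the sketch below uses exact rational arithmetic and illustrative parameters, assuming the scheme parameter $t$ ranges over $1, \ldots, K-1$:

```python
from fractions import Fraction

def secure_corner_points(N, K):
    """Corner points (M, R) of the secure trade-off quoted above, at
    M = 1 + N*t/(K - t); the full curve R_C(M) is their lower convex
    envelope (memory-sharing between adjacent points)."""
    points = []
    for t in range(1, K):
        M = 1 + Fraction(N * t, K - t)
        R = Fraction(K) * (N + M - 1) / (N + (K + 1) * (M - 1))
        points.append((M, R))
    return points

for M, R in secure_corner_points(N=4, K=4):
    print(f"M = {M}, R = {R}")
# M = 7/3, R = 2    M = 5, R = 4/3    M = 13, R = 1
```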

File-Level and Small Subpacketization Coded Caching

Alternate strategies explore coded file-level caching: each cache stores entire (unsplit) files, and file-level XORs are used in the broadcast phase. While this forgoes the full gain of MAN, it significantly simplifies implementation and, via greedy clique covers or matching algorithms over side-information graphs, captures a large fraction of the additive coded-multicasting gain, especially for moderate $M/N$. Extending to subfile partitioning with a small number $\Delta$ of subfiles recovers nearly all the gain of full subpacketization (Saberali et al., 2017).
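
A minimal sketch of the matching idea on the side-information graph, assuming whole-file caches and using greedy pairing in place of a full clique cover (the function name and greedy rule are illustrative):

```python
def greedy_pair_delivery(demands, caches):
    """File-level pairing sketch: users i and j can share one XOR
    transmission W_{d_i} ^ W_{d_j} iff each caches the other's request;
    unpaired users fall back to uncoded unicast. caches[k] is the set
    of whole files stored by user k."""
    K = len(demands)
    served, transmissions = set(), []
    for i in range(K):
        if i in served:
            continue
        for j in range(i + 1, K):
            if j not in served and demands[j] in caches[i] \
                    and demands[i] in caches[j]:
                transmissions.append(("XOR", demands[i], demands[j]))
                served.update({i, j})
                break
        else:
            transmissions.append(("UNICAST", demands[i]))
        served.add(i)
    return transmissions

# Users 0 and 1 cache each other's request; user 2 gets a unicast.
print(greedy_pair_delivery(demands=[0, 1, 2], caches=[{1}, {0}, {0}]))
# [('XOR', 0, 1), ('UNICAST', 2)]
```

Larger cliques in the side-information graph correspond to XORs serving three or more users at once; greedy clique covers trade optimality for tractability on this NP-hard covering problem.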

Multi-Library and Multi-Level Popularity

Coded caching frameworks have been generalized to settings with multiple file libraries, where users make independent requests from each library. When all libraries have the same cardinality, optimality is achieved by memory-sharing: splitting each cache in proportion to library size and applying MAN independently. Coding across libraries offers no further gain in the uniform case. In multi-level popularity models, memory-sharing and clustering approaches establish order-optimal solutions, with new entropy-based lower bounds developed for multi-user, multi-level profiles (Sahraei et al., 2016, Hachem et al., 2014).

Dynamic, Asynchronous, and Cooperative Coded Caching

Recent work addresses non-static or dynamic user populations, time-varying user arrivals, and asynchrony in requests and deadlines. In dynamic settings with fixed and mobile users, concatenation-based placement and graph-based saturating-matching delivery ensure minimal placement updates while retaining multicast gains (Zhang et al., 2019). For asynchronous caching with heterogeneous deadlines, offline LP and online heuristic solutions leveraging all-but-one index coding maintain high coding gains under mild asynchronism, with feasibility guarantees (Ghasemi et al., 2019). User cooperation and parallel transmission opportunities further decrease delivery delay; the associated gains are quantified, and order-optimality within constant factors is established for both centralized and decentralized settings (Chen et al., 2020).

Coded Placement and Linear Function Retrieval

Coded placement (where cache contents are linear combinations of files) can provide additional gains, particularly in the small-memory regime ($M \leq N/K$), as shown by schemes employing $t$-sum linear placement and multi-stage delivery over small finite fields (Ma et al., 27 Apr 2024). Scalar Linear Function Retrieval (SLFR) extends coded caching to functional requests; general linear coding schemes achieve optimal load using combinatorial and algebraic structure, often with the solution reducible to a constrained spanning tree in a bipartite graph encoding cycle constraints among encoding coefficients (Ma et al., 2021).

5. Network Topologies, Shared Caches, and Helper Architectures

The extension of coded caching to networks with shared helper caches (SC-CC) and multilayer or relay-based architectures demands new combinatorial and algebraic constructions. In SC-CC, users access helper caches with general user-to-cache association profiles; both subpacketization and rate can be controlled via the use of matroid circuits, placement delivery arrays (PDAs), and design-theoretic methods (Das et al., 2022, Peter et al., 2021). These approaches achieve rates close to known optimal shared-cache schemes, with the subpacketization level decoupled from the number of users or caches, supporting system scalability and incremental cache addition.
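
Since PDAs recur throughout these constructions, it may help to see their defining conditions as executable checks; the sketch below follows the commonly stated PDA definition ('*' entries mark cached subfiles, shared integers mark coded transmissions):

```python
def is_pda(P, Z):
    """Check the PDA conditions on an F x K array P whose entries are
    '*' or integers: every column has exactly Z stars (C1), and any
    integer appearing at two cells (j1, k1), (j2, k2) must have
    j1 != j2, k1 != k2, and stars at the crossed cells (j1, k2) and
    (j2, k1) (C3), which guarantees decodability of the XOR."""
    F, K = len(P), len(P[0])
    if any(sum(P[j][k] == '*' for j in range(F)) != Z for k in range(K)):
        return False
    positions = {}
    for j in range(F):
        for k in range(K):
            if P[j][k] != '*':
                positions.setdefault(P[j][k], []).append((j, k))
    for cells in positions.values():
        for a, (j1, k1) in enumerate(cells):
            for j2, k2 in cells[a + 1:]:
                if j1 == j2 or k1 == k2:
                    return False
                if P[j1][k2] != '*' or P[j2][k1] != '*':
                    return False
    return True

# The MAN scheme for K = 3, t = 1 viewed as a PDA: rows are subfiles,
# columns are users, and each integer is one coded transmission.
P = [['*', 1, 2],
     [1, '*', 3],
     [2, 3, '*']]
print(is_pda(P, Z=1))  # True
```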

For two-hop or relay topologies, if the network satisfies a resolvability property (the user-relay incidence matrix admits a partition into parallel classes), uniform coded placement and coded multicasting across parallel classes exploit symmetry to strictly improve upon prior schemes in both server-to-relay and relay-to-user rates, with reduced subpacketization (Tang et al., 2016). The theoretical framework accommodates, as special cases, combination networks and affine-plane constructions, and raises further questions regarding extending these gains to arbitrary topologies.

6. Algorithmic and Learning-Theoretic Methods

Given the high combinatorial complexity of scheduling coded transmissions over arbitrary or partially structured cache contents, learning-based approaches have been adopted. Deep reinforcement learning (actor-critic networks) can, with polynomial complexity, learn delivery policies that match or exceed known approximations to the NP-hard scheduling problem, by directly interacting with the system and optimizing reward functions based on delivery delay (NaderiAlizadeh et al., 2019). These methods support online adaptation to arbitrary cache states and request profiles, and open avenues for coded caching in practical, evolving, or hard-to-code settings.
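
As a rough illustration of the problem framing only (not the cited paper's exact formulation), the scheduling task can be skeletonized as an environment with a per-transmission penalty, so that maximizing return minimizes delivery delay; all names below are hypothetical:

```python
class CodedDeliveryEnv:
    """Hypothetical delivery-scheduling environment sketch. State: the
    (user -> outstanding subfiles) map. Action: one coded transmission,
    given as the subfile each served user decodes from it (the policy
    must ensure all other XOR terms sit in that user's cache). Reward:
    -1 per transmission, so fewer transmissions means higher return."""

    def __init__(self, outstanding):
        self.outstanding = {k: set(v) for k, v in outstanding.items()}

    def step(self, action):
        for user, subfile in action.items():
            self.outstanding[user].discard(subfile)
        done = all(not rest for rest in self.outstanding.values())
        return self.outstanding, -1.0, done

# Two users, one outstanding subfile each, served by a single XOR:
env = CodedDeliveryEnv({0: {('A', 1)}, 1: {('B', 0)}})
state, reward, done = env.step({0: ('A', 1), 1: ('B', 0)})
print(reward, done)  # -1.0 True
```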

7. Key Insights and Open Problems

The coded caching field is characterized by a combination of strong information-theoretic guarantees (multicast gain, order-optimality), sharp combinatorial structures (block designs, PDAs, matroids, secret sharing), and performance-complexity trade-offs (rate vs. subpacketization, coding complexity vs. gain). Open problems include optimality under non-uniform popularity and access, scaling of subpacketization for large $K$, optimality of memory-sharing in multicomponent systems, robust coded caching in non-symmetric or time-varying networks, and tighter integration of coded caching into practical content-delivery infrastructures.

This ongoing research area continues to bridge network coding, combinatorial design, and distributed storage, establishing coded caching as a foundational component of next-generation information and communication systems.
