
Non-LRU Replacement Policies

Updated 7 December 2025
  • Non-LRU replacement policies are cache management strategies that avoid strict recency ordering to address scan vulnerabilities and shifting access patterns.
  • They integrate adaptive methods, frequency and utility metrics, and even deep learning approaches to reduce miss rates by up to 29% compared to classical LRU.
  • Future directions focus on multi-tier optimization, online refinement, and hardware-software co-design to further improve performance and scalability.

Non-LRU (Least Recently Used) replacement policies refer to a broad class of cache management algorithms that avoid the strict recency orderings used by classical LRU. These policies have emerged in response to demonstrated weaknesses of LRU, such as non-resilience to scan workloads, poor adaptation to dynamically shifting access distributions, lack of cost awareness, and hardware scalability challenges. Non-LRU policies are diverse, spanning adaptive hybrid schemes, frequency- and utility-driven strategies, deep-learning approaches, hardware-specific pseudo-LRU constructions, and domain-specific algorithms addressing data dependency or persistent memory constraints.

1. Foundations and Shortcomings of LRU

Classical LRU tracks the temporal recency of blocks and evicts the least recently referenced block upon a miss. It suffers from several structural limitations:

  • Scan vulnerability: In workloads exhibiting scans (one-time accesses to a set of blocks), LRU can evict frequently used items, as scanned blocks rapidly become "least recently used" and push out valuable data.
  • Unidimensionality: LRU considers only recency, ignoring frequency and future utility, resulting in suboptimal decisions under complex or mixed workload patterns.
  • Hardware complexity: True LRU requires $O(k \log k)$ state bits per set for associativity $k$ and a move-to-front update on every access (a minimal reference implementation is sketched below).

These limitations have motivated the development of non-LRU policies that offer better adaptation, stronger asymptotic guarantees, or superior empirical performance in diverse environments (Consuegra et al., 2015).
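
For concreteness, a minimal software LRU cache built on Python's OrderedDict illustrates the per-access move-to-front bookkeeping referred to above; this is an illustrative sketch of the textbook policy, not a model of any hardware implementation.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: every access moves the key to the MRU end,
    and misses evict from the LRU end when the cache is full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = OrderedDict()       # key -> value, ordered LRU -> MRU

    def get(self, key):
        if key not in self.store:
            return None                  # miss
        self.store.move_to_end(key)      # the move-to-front step LRU pays on every hit
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        elif len(self.store) >= self.capacity:
            self.store.popitem(last=False)   # evict the least recently used key
        self.store[key] = value
```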

2. Adaptive and Hybrid Replacement Schemes

Several non-LRU policies integrate both recency and frequency information, often employing dynamic or online tuning:

ARC and CAR: Adaptive Replacement Cache and CLOCK with Adaptive Replacement partition the cache into recency and frequency pools and employ ghost lists to track recently evicted items. An adaptive parameter $p$ controls the recency-frequency boundary. On hits in the "ghost" history lists, $p$ is nudged to emphasize the observed dominant access pattern. ARC organizes its pools as LRU lists, while CAR employs CLOCK-style bit-marked circular buffers. These policies are $O(N)$-competitive with OPT, and empirically reduce miss rates by 10–30% versus LRU on real-world workloads (Consuegra et al., 2015).
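
The following sketch transcribes the published ARC pseudocode (Megiddo and Modha) into Python to show how ghost-list hits drive the adaptive target $p$; the integer arithmetic in the adaptation step is a simplification of the paper's exact ratios, and the class layout is ours.

```python
from collections import OrderedDict

class ARC:
    """Sketch of Adaptive Replacement Cache.  T1/T2 hold cached keys
    (recency / frequency pools); B1/B2 are ghost lists of recently evicted
    keys.  The adaptive target p shifts capacity toward whichever ghost
    list is currently receiving hits."""

    def __init__(self, capacity):
        self.c, self.p = capacity, 0
        self.t1, self.t2 = OrderedDict(), OrderedDict()   # cached keys
        self.b1, self.b2 = OrderedDict(), OrderedDict()   # ghost (history) keys

    def _replace(self, key):
        # Move the LRU key of T1 or T2 into the matching ghost list.
        if self.t1 and (len(self.t1) > self.p or (key in self.b2 and len(self.t1) == self.p)):
            old, _ = self.t1.popitem(last=False)
            self.b1[old] = None
        else:
            old, _ = self.t2.popitem(last=False)
            self.b2[old] = None

    def access(self, key):
        """Return True on a cache hit, False on a miss."""
        if key in self.t1 or key in self.t2:          # hit: promote to MRU of T2
            (self.t1 if key in self.t1 else self.t2).pop(key)
            self.t2[key] = None
            return True
        if key in self.b1:                            # ghost hit: favour recency
            self.p = min(self.c, self.p + max(len(self.b2) // max(len(self.b1), 1), 1))
            self._replace(key)
            self.b1.pop(key)
            self.t2[key] = None
            return False
        if key in self.b2:                            # ghost hit: favour frequency
            self.p = max(0, self.p - max(len(self.b1) // max(len(self.b2), 1), 1))
            self._replace(key)
            self.b2.pop(key)
            self.t2[key] = None
            return False
        total = len(self.t1) + len(self.t2) + len(self.b1) + len(self.b2)
        if len(self.t1) + len(self.b1) == self.c:     # complete miss, L1 = T1 ∪ B1 full
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(key)
            else:
                self.t1.popitem(last=False)
        elif total >= self.c:                         # cache plus history at capacity
            if total >= 2 * self.c:
                self.b2.popitem(last=False)
            self._replace(key)
        self.t1[key] = None                           # new key enters the recency pool
        return False
```

Scan-heavy traces keep hitting B1, which grows $p$ and protects the frequency pool T2; frequency-heavy traces pull $p$ back down.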

AdaptiveClimb and DynamicAdaptiveClimb: This recently introduced policy maintains a single "jump" parameter reflecting the typical promotion distance; frequent hits decrease the jump (CLIMB-like), while misses increase it (LRU-like), allowing rapid shifts between aggressive and conservative promotion regimes. The extension, DynamicAdaptiveClimb, adds cache-resizing logic that doubles or halves the cache size upon detecting sustained patterns in hit/miss rates, all with $O(1)$ metadata and instructions per operation. These policies outperform FIFO, LRU, ARC, and SIEVE by 10–15% on highly dynamic workloads and by up to 29% over FIFO on representative key-value and CDN traces (Berend et al., 26 Nov 2025).
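
Because the exact update and resizing rules are specified in the cited paper, the sketch below only illustrates the single-parameter idea: one `jump` value interpolating between CLIMB-style and LRU-style promotion. The constants, insertion position, and class name are assumptions, and the linear scans stand in for the $O(1)$ structures the real policy uses.

```python
class AdaptiveClimbSketch:
    """Illustrative only: a single `jump` parameter interpolates between
    CLIMB (promote one slot per hit) and LRU (promote to the front).
    The real policy's exact rules are given in Berend et al."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.stack = []                  # index 0 = top (most protected position)
        self.jump = capacity             # start in the LRU-like regime

    def access(self, key):
        if key in self.stack:                                       # hit
            i = self.stack.index(key)
            self.stack.insert(max(0, i - self.jump), self.stack.pop(i))  # promote by `jump`
            self.jump = max(1, self.jump - 1)                       # hits push toward CLIMB
            return True
        if len(self.stack) >= self.capacity:                        # miss: evict the bottom
            self.stack.pop()
        self.stack.append(key)                                      # insert at the bottom (CLIMB-style)
        self.jump = min(self.capacity, self.jump + 1)               # misses push toward LRU
        return False
```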

AWRP: The Adaptive Weight Ranking Policy maintains, per block, a weight $W_i = F_i / (N - R_i)$, where $F_i$ is the access frequency, $R_i$ the time of the last access, and $N$ the global reference counter, evicting the item with the lowest weight. This balances both recency and frequency, yielding improved hit ratios over LRU and CAR at the cost of $O(C)$ victim selection (Swain et al., 2011).
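
The weight formula translates directly into a small simulation; the dictionary layout and the linear victim scan (the $O(C)$ cost noted above) are illustrative choices, not the authors' implementation.

```python
class AWRPCache:
    """Adaptive Weight Ranking Policy sketch: evict the block with the
    lowest weight W_i = F_i / (N - R_i), where F_i is the access frequency,
    R_i the time of the last access, and N the global reference counter."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.n = 0                       # global reference counter N
        self.freq = {}                   # key -> F_i
        self.last = {}                   # key -> R_i

    def access(self, key):
        self.n += 1
        hit = key in self.freq
        if not hit and len(self.freq) >= self.capacity:
            # O(C) victim selection: the smallest weight is evicted.
            victim = min(self.freq, key=lambda k: self.freq[k] / (self.n - self.last[k]))
            del self.freq[victim], self.last[victim]
        self.freq[key] = self.freq.get(key, 0) + 1
        self.last[key] = self.n
        return hit
```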

Scored LRU/LFU hybrids: Some policies (e.g., for video prefix caches) split the cache into recency and frequency lists, use scores that combine access count and last-access time, and employ ghost lists for online adaptation between workload phases (Jayarekha et al., 2010).

3. Learning-Based and Belady-Inspired Policies

Deep learning and imitation learning have recently produced non-LRU cache policies that more closely approximate Belady's MIN (optimal replacement):

Parrot: Learns a cache replacement strategy by supervised imitation of Belady's optimal decision, using past-access windows, cache state, and address/PC embeddings as input to a neural policy network with LSTM and attention modules. Training leverages DAgger to avoid drift, and the learned policy achieves up to 16% absolute hit-rate gains on SPEC workloads and 61% improvement over LRU on web-scale traces (Liu et al., 2020).
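
Policies in this family are trained against the offline optimum, so it helps to see how the oracle labels are produced. The sketch below computes Belady's MIN decisions over a complete trace (the imitation target, not Parrot itself); the function name and return format are ours.

```python
def belady_min(trace, capacity):
    """Offline simulation of Belady's MIN: on a miss with a full cache,
    evict the resident key whose next use lies furthest in the future
    (or never occurs again).  Returns the hit count and the eviction
    decisions that imitation-learning policies can use as labels."""
    # Precompute, for each position, the index of that key's next occurrence.
    next_use = [0] * len(trace)
    upcoming = {}                                  # key -> next index at which it appears
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = upcoming.get(trace[i], float("inf"))
        upcoming[trace[i]] = i

    cache = {}                                     # key -> index of its next use
    hits, evictions = 0, []
    for t, key in enumerate(trace):
        if key in cache:
            hits += 1
        elif len(cache) >= capacity:
            victim = max(cache, key=cache.get)     # furthest future use (inf = never again)
            evictions.append((t, victim))
            del cache[victim]
        cache[key] = next_use[t]                   # this key's next reuse time from here on
    return hits, evictions


# Example: six requests over a 2-entry cache.
print(belady_min(["a", "b", "a", "c", "b", "a"], capacity=2))
```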

MUSTACHE: Treats page-access prediction as a categorical time-series forecasting problem and drives multistep prediction (the next $k$ accesses) via an LSTM; the predicted future requests guide eviction decisions, falling back to LRU under ambiguity. MUSTACHE yields hit rates of 92.5% vs. 90.8% (LRU) and reduces I/O operations by 18.4% (reads) and 10.3% (writes), covering half the gap to Belady's OPT on real workloads (Tolomei et al., 2022).
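
Independent of the forecasting model, the eviction rule described above can be sketched as follows: prefer victims absent from the predicted next-$k$ requests and fall back to LRU ordering otherwise. The LSTM predictor itself is omitted, and the helper name is hypothetical.

```python
from collections import OrderedDict

def choose_victim(lru_ordered_cache: OrderedDict, predicted_next_k: list):
    """Pick an eviction victim given a prediction of the next k requests.
    `lru_ordered_cache` is ordered LRU -> MRU; `predicted_next_k` comes from
    an external forecaster (e.g. an LSTM as described above).  Pages absent
    from the prediction are preferred; if every cached page is predicted to
    recur soon, fall back to plain LRU order."""
    predicted = set(predicted_next_k)
    for key in lru_ordered_cache:              # scan from the LRU end
        if key not in predicted:
            return key                         # not expected to be reused soon
    return next(iter(lru_ordered_cache))       # all predicted: evict the LRU page
```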

Expected Hit Count (EHC): Identifies strong intra-region correlation in hit counts under Belady's MIN and introduces a region-based expected-hit-count indicator to supplement Belady-inspired predictors (e.g., Hawkeye) on lines they leave undecided. EHC requires just 3 tag bits and a compact auxiliary table, reducing MPKI over LRU by 17.5% and outperforming Hawkeye and DRRIP (Ghahani et al., 2018).

4. Specialty Non-LRU Policies

Special-purpose environments or hardware require tailored policies:

Dependency-aware (LRC): For data analytics clusters (e.g., Spark) with explicit job DAGs, blocks are evicted based on the count of downstream uncomputed dependents ("reference count") instead of access recency. This enables immediate eviction of blocks with no future utility, offering up to 60% speedups and up to 60% memory savings over LRU in cluster benchmarks (Yu et al., 2017).
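
A hedged sketch of the reference-count bookkeeping: counts come from the job DAG when a block is cached, are decremented as downstream tasks consume the block, and eviction picks the block with the smallest remaining count. The class and method names are ours, not the LRC implementation.

```python
class LRCCache:
    """Least Reference Count sketch: each cached block carries the number of
    not-yet-executed downstream tasks that will read it (from the job DAG).
    Eviction removes the block with the smallest remaining reference count."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.ref_count = {}                    # block id -> remaining downstream reads

    def insert(self, block, downstream_reads):
        """Cache a block with its DAG-derived count of future readers."""
        if block not in self.ref_count and len(self.ref_count) >= self.capacity:
            victim = min(self.ref_count, key=self.ref_count.get)
            del self.ref_count[victim]
        self.ref_count[block] = downstream_reads

    def consume(self, block):
        """A downstream task has read the block: one fewer future use."""
        if block in self.ref_count:
            self.ref_count[block] -= 1
            if self.ref_count[block] <= 0:
                del self.ref_count[block]      # no future utility: evict immediately
```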

Hybrid SRAM/PCM (DFB): In hybrid caches with SRAM (fast) and PCM (slow), DFB (Dead Fast Block) prioritizes writes and hot blocks into SRAM and evicts "dead" SRAM lines (whose recency drops below a threshold $Z$) early, shielding PCM from write load. This yields up to 6.9× lifetime extension, +36% IPC, and energy reductions compared to LRU on PCM (Mittal, 2013).

Utility-Optimized: Utility-based policies assign each content $i$ a concave utility function $U_i(h_i)$ of its hit probability $h_i$ and seek to maximize $\sum_i U_i(h_i)$ under cache size or cost constraints. TTL-based implementations can reproduce LRU, FIFO, fair policies, or arbitrary weighted objectives by solving simple KKT/Lagrangian equations and operate via local on-hit/on-miss adjustments to TTL timers (Dehghan et al., 2016).
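
To make the KKT step concrete, consider the special case $U_i(h_i) = w_i \log h_i$ with a linear budget $\sum_i h_i \le B$ and $0 \le h_i \le 1$: stationarity gives $h_i = \min(1, w_i/\nu)$, and the multiplier $\nu$ can be found by bisection. This is an illustrative special case, not the TTL controller of the cited paper.

```python
def utility_optimal_hit_rates(weights, budget, iters=100):
    """Maximize sum_i w_i * log(h_i) subject to sum_i h_i <= budget and
    0 < h_i <= 1.  The KKT conditions give h_i = min(1, w_i / nu); bisect
    on the Lagrange multiplier nu until the budget constraint is tight."""
    n = len(weights)
    if budget >= n:
        return [1.0] * n                         # budget never binds: h_i = 1 for all i
    lo, hi = 1e-12, sum(weights) / budget        # at hi, the allocation is feasible
    for _ in range(iters):
        nu = (lo + hi) / 2.0
        total = sum(min(1.0, w / nu) for w in weights)
        if total > budget:
            lo = nu                              # allocation too generous: raise nu
        else:
            hi = nu
    return [min(1.0, w / hi) for w in weights]


# Example: three contents with weights 3, 2, 1 sharing an effective budget of 1.5
# receive hit probabilities roughly proportional to their weights.
print(utility_optimal_hit_rates([3.0, 2.0, 1.0], 1.5))
```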

Ranking/Cost-based: For VoD caches, objects are ranked via composite metrics accounting for age, access frequency, object size, transfer cost, and observed Zipf-like popularity. Eviction selects the object with the minimal rank value, outperforming LRU, LFU, and Greedy Dual on byte-hit and latency metrics (Nair et al., 2010).

5. Non-LRU Hardware-Friendly and Pseudo-LRU Schemes

Performance and energy constraints in hardware have prompted non-LRU policies with low overhead:

Intel Quad-Age Policy: Modern Intel CPUs use a 2-bit per-line "quad-age" approximation. Hits decrement their line's age, but only the accessed line is updated, and new fills are inserted with an age older than MRU. This is not true LRU and is vulnerable to side-channel exploitation, e.g., via RELOAD+REFRESH which bypasses L3-miss-based detection mechanisms (Briongos et al., 2019).
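
The mechanism can be sketched as an RRIP-style 2-bit age scheme for a single set; the exact fill and hit ages used by Intel hardware are not public, so the constants below are assumptions.

```python
class QuadAgeSet:
    """Sketch of a 2-bit 'quad-age' replacement state for one cache set.
    Ages range from 0 (recently useful) to 3 (oldest, evict first).
    On a hit only the touched line's age changes; new fills enter at an
    age older than the youngest, so they must prove their worth."""

    FILL_AGE = 2              # assumption: fills enter older than MRU (age 0)

    def __init__(self, ways):
        self.ways = ways
        self.lines = {}       # tag -> age, at most `ways` entries

    def access(self, tag):
        if tag in self.lines:
            self.lines[tag] = max(0, self.lines[tag] - 1)   # hit: the line's age decrements
            return True
        if len(self.lines) >= self.ways:
            # Evict a line at the maximum age, ageing the whole set if none exists.
            while not any(age == 3 for age in self.lines.values()):
                self.lines = {t: age + 1 for t, age in self.lines.items()}
            victim = next(t for t, age in self.lines.items() if age == 3)
            del self.lines[victim]
        self.lines[tag] = self.FILL_AGE
        return False
```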

Randomized Cache Policies: In security-critical settings, LRU's stateful ordering is prohibitively expensive to implement. Alternatives include:

  • RRP (random replacement): stateless and simple, but with high miss rates and security weaknesses.
  • DRPLRU/FRPLRU: age tracking via 2 bits combined with dynamic or fixed recency orderings, offering modest overhead and improved security.
  • VARP-64: per-line ages with depth $m = 64$, offering a tunable trade-off between near-LRU performance and side-channel resistance at increased state overhead (Peters et al., 2023). Proper age-counter configuration significantly increases the complexity facing attackers while achieving near-LRU miss rates.

6. Theoretical Properties and Complexity

Non-LRU policies exhibit different computational properties with respect to static cache analysis:

  • Static analysis of LRU caches is NP-complete for arbitrary control flow, but can be performed efficiently for fixed associativity.
  • Non-LRU policies like FIFO, PLRU, pseudo-RR, and NMRU yield PSPACE-complete analysis problems when program CFGs are cyclic. This intractability explains the preference for LRU in timing-sensitive or real-time systems (Monniaux et al., 2018).
  • ARC and CAR have proven competitiveness bounds ($4N$ for ARC, $18N$ for CAR) and avoid unbounded pathological sequences (Consuegra et al., 2015).

7. Performance, Scalability, and Domain Trade-offs

| Policy | Domain | Metadata | Perf. vs. LRU† | Hardware complexity | Adaptivity |
|---|---|---|---|---|---|
| LRU | General | $O(\log k)$ per line | baseline | expensive move-to-front | none |
| ARC | FS/DB workloads | ghost lists, LRU lists | –10–30% | moderate | self-tuning $p$ |
| CAR | General | CLOCK, mark bits | –10–30% | low | self-tuning $p$ |
| AWRP | General | $F_i$, $R_i$, $N$ | +0–17 pp | $O(C)$ on miss | none |
| AdaptiveClimb | Dynamic cloud/CDN | 1–2 scalars | –10–29% | $O(1)$ | fast drift, resizing |
| DFB | SRAM/PCM hybrid | LRU order, threshold $Z$ | +36% IPC | low | write-aware |
| Parrot | General | DNN (LSTM + attention) | +16–61 pp | high (for now) | trace-dependent |
| MUSTACHE | OS page cache | LSTM (20M params) | +1.7% | moderate to high | online retraining |
| EHC | LLC (CPU) | 3 bits + region table | +5.2% IPC | low | region hit correlation |
| VARP-64 | Security/randomized caches | 6 bits/line | +1% | moderate | tunable |

† Relative to LRU on representative workloads.

Notably, non-LRU policies systematically match or outperform LRU, especially as the diversity or volatility of access patterns increases, working-set sizes fluctuate, or side-channel resistance is required. Memory and computational overheads are tractable on modern hardware except for some neural policies, for which further compression or hardware acceleration is a prominent future research direction (Liu et al., 2020, Peters et al., 2023).

8. Limitations and Future Directions

Non-LRU policies may require domain knowledge (LRC needs job DAGs), be susceptible to distribution drift (learned policies must be retrained periodically), or face corner-case oscillation (AdaptiveClimb under pathological workloads). The structure of their internal indicators (expected hit count, region window, utility weights) directly limits their predictive power. Future advances will likely target:

  • Multi-tier, multi-objective cache optimization (e.g., joint replacement and prefetch).
  • Compressed learned or hardware-accelerated neural policies deployable at line rate.
  • Integration with application and dataflow metadata (as in LRC) for expert-guided policies.
  • Security-driven design in randomized or adversarial settings, balancing miss rate, area, and side-channel resilience.
  • Online, adaptive refinement or self-tuning beyond fixed parameters for evolving workloads.

Non-LRU policies constitute a vibrant research area, now spanning foundational algorithmics, systems deployment, hardware-software codesign, and security. Rigorous competitiveness, practical performance, and emerging learned schemes are collectively advancing the boundaries well beyond classic LRU.
