Cost-Aware LFU for Cloud Caching

Updated 6 April 2026

Cost-aware LFU is a caching policy that integrates access frequency with explicit storage and compute cost models for optimized cloud caching decisions.
The approach uses a per-item threshold (λ > S/C) to make independent, online caching decisions, ensuring provable optimality under steady-state conditions.
Empirical evaluations demonstrate 10–30% cost reductions over traditional LRU and TTL policies, validating its effectiveness in dynamic, pay-per-usage cloud environments.

A cost-aware least-frequently-used (LFU) policy is a caching approach for cloud-based systems that incorporates both access frequency and explicit cloud cost models—specifically storage and compute costs—rather than traditional fixed-capacity constraints. Unlike classical cache replacement algorithms that evict entries based purely on recency or historical frequency, the cost-aware LFU evaluates, for each individual item, whether retaining it in the cache minimizes overall operational cost under the cloud’s pay-per-usage paradigm. The policy is fully decomposable across items, provably optimal under stationary access patterns, and achieves near-optimal performance with practical, online frequency estimation.

1. Problem Setting and Cost Model

The central context is cloud-based caching where data can be either recomputed on the fly or served from cloud storage. For each item $i \in \{1,\ldots,N\}$, the following holds:

Request Arrival: Each item $i$ is requested according to a Poisson process at rate $\lambda_{i}$.
Compute Cost: Recomputing item $i$ (a miss) costs $C_{i}$ per access.
Storage Cost: Storing item $i$ in the cloud cache costs $S_{i}$ per unit time.
Capacity Assumption: There is no fixed storage limit. Operators pay proportional to cache occupancy and duration, distinct from classical caches dominated by up-front capacity investment.
Transfer Cost: Assumed negligible or folded into $C_{i}$.

Empirical access frequency $f_{i}$ is measured as the count of requests in a sliding window of length $W$ divided by $W$; under steady-state Poisson arrivals, $f_{i} \to \lambda_{i}$ as $W \to \infty$ (Scouarnec et al., 2013).
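The sliding-window frequency measurement described above can be sketched as follows. This is a minimal illustration, not code from the source: the class name `SlidingWindowRate`, its methods, and the sample timestamps are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowRate:
    """Estimate f_i = (requests in the last W time units) / W for one item."""

    def __init__(self, window: float):
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        self._times = deque()  # request timestamps, oldest first

    def record(self, t: float) -> None:
        """Register a request at time t (assumed non-decreasing)."""
        self._times.append(t)

    def rate(self, now: float) -> float:
        """Return the empirical frequency over the window (now - W, now]."""
        cutoff = now - self.window
        # Drop timestamps that have fallen out of the window.
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times) / self.window


# Hypothetical trace: four requests, window of 10 time units.
est = SlidingWindowRate(window=10.0)
for t in [1.0, 2.0, 3.0, 12.0]:
    est.record(t)
rate_at_12 = est.rate(now=12.0)  # only the requests at t=3 and t=12 remain
print(rate_at_12)
```

A deque keeps both recording and expiry cheap, since timestamps only ever enter at one end and leave at the other.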

2. Mathematical Formulation

Let $x_{i} \in \{0,1\}$ indicate whether item $i$ is always cached ($x_{i} = 1$) or never cached ($x_{i} = 0$). The optimization objective is the long-run average cost per time unit:

$J(x) = \sum_{i=1}^{N} \left[ x_{i} S_{i} + (1 - x_{i}) \lambda_{i} C_{i} \right]$

If cached ($x_{i} = 1$): Pay $S_{i}$ per unit time; all accesses are hits.
If not cached ($x_{i} = 0$): Pay recompute cost $C_{i}$ per access at rate $\lambda_{i}$.

The problem is separable across items; minimizing $J$ reduces to per-item decisions:

$x_{i}^{*} = \begin{cases} 1 & \text{if } \lambda_{i} C_{i} > S_{i} \\ 0 & \text{otherwise} \end{cases}$
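To make the separability claim concrete, the sketch below compares the per-item decision (cache exactly when $\lambda_{i} C_{i} > S_{i}$) against a brute-force search over every cache/no-cache assignment. The item parameters are hypothetical values chosen for illustration, and the cost function assumes the model stated above: a cached item pays $S_{i}$ per unit time, an uncached item pays $\lambda_{i} C_{i}$ per unit time.

```python
from itertools import product

# Hypothetical items: (request rate, compute cost per miss, storage cost per unit time)
items = [
    (5.0, 0.02, 0.05),
    (0.5, 0.02, 0.05),
    (2.0, 0.10, 0.05),
    (1.0, 0.01, 0.05),
]


def total_cost(assignment, items):
    """Long-run cost per unit time for a 0/1 assignment (1 = always cached)."""
    return sum(
        s if cached else lam * c
        for cached, (lam, c, s) in zip(assignment, items)
    )


# Per-item rule: cache item i exactly when lam_i * C_i > S_i.
per_item = tuple(1 if lam * c > s else 0 for lam, c, s in items)

# Exhaustive search over all 2^N assignments (feasible only for tiny N).
brute_force = min(
    product((0, 1), repeat=len(items)),
    key=lambda assignment: total_cost(assignment, items),
)

print(per_item, brute_force, total_cost(per_item, items))
```

Because no term in the objective couples two items, the exhaustive search and the independent per-item choices select the same assignment here.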

3. Cost-Aware LFU Rule and Online Algorithm

Define a per-item utility score:

$U_{i} = \lambda_{i} C_{i} - S_{i}$

$U_{i} > 0$ implies it is cost-effective to cache item $i$; otherwise, evict it.

Equivalently, use the threshold:

If $\lambda_{i} > S_{i}/C_{i}$, set $x_{i} = 1$ (cache indefinitely).
If $\lambda_{i} \le S_{i}/C_{i}$, set $x_{i} = 0$ (never cache).
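As a worked illustration of the threshold $S_{i}/C_{i}$, consider the following arithmetic. The prices and request rate are hypothetical, chosen only to keep the numbers simple; they are not values from the source.

```latex
% Hypothetical prices (assumptions, not values from the source)
S_i = 0.0001~\text{USD per hour}, \qquad C_i = 0.002~\text{USD per miss}
\quad\Longrightarrow\quad
\frac{S_i}{C_i} = 0.05~\text{requests per hour}.

% An item requested at \lambda_i = 0.1 per hour exceeds the threshold:
\lambda_i C_i = 0.1 \times 0.002 = 0.0002~\text{USD per hour}
\;>\; S_i = 0.0001~\text{USD per hour},
\quad\text{so caching is the cheaper option.}
```

With these numbers the break-even point is one request every 20 hours: an item requested more often than that is cheaper to store than to recompute.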

Online implementation uses a sliding-window estimator of frequency: $f_{i}(t) = \frac{1}{W} \, \bigl| \{ \text{requests for item } i \text{ in } (t - W,\, t] \} \bigr|$. This “cost-aware LFU” (editor's term) policy compares the observed frequency $f_{i}$ to $S_{i}/C_{i}$ for each item, diverging from classic LFU by making individual, threshold-based caching decisions absent any global capacity constraint (Scouarnec et al., 2013).
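One way the online rule could be put together is sketched below. It is an illustration under stated assumptions rather than the source's implementation: the class `CostAwareLFU` and its methods are hypothetical names, a single storage price and compute price are assumed for all items, and the `sweep` method is an added convenience for evicting items whose requests have stopped.

```python
from collections import defaultdict, deque


class CostAwareLFU:
    """Keep an item cached while its windowed frequency exceeds S / C."""

    def __init__(self, window, storage_cost, compute_cost):
        self.window = window
        # Assumption: uniform prices, so one threshold serves every item.
        self.threshold = storage_cost / compute_cost
        self._times = defaultdict(deque)  # per-item request timestamps
        self.cached = set()

    def _rate(self, key, now):
        """Windowed frequency of `key` over (now - W, now]."""
        times = self._times[key]
        cutoff = now - self.window
        while times and times[0] <= cutoff:
            times.popleft()
        return len(times) / self.window

    def request(self, key, now):
        """Serve a request; return True on a hit, False on a miss."""
        hit = key in self.cached
        self._times[key].append(now)
        # Re-evaluate the per-item decision with the updated estimate.
        if self._rate(key, now) > self.threshold:
            self.cached.add(key)
        else:
            self.cached.discard(key)
        return hit

    def sweep(self, now):
        """Evict cached items whose frequency has dropped to the threshold or below."""
        for key in list(self.cached):
            if self._rate(key, now) <= self.threshold:
                self.cached.discard(key)


# Hypothetical prices: threshold = 0.25 requests per time unit.
cache = CostAwareLFU(window=10.0, storage_cost=0.25, compute_cost=1.0)
hits = [cache.request("a", t) for t in (1.0, 2.0, 3.0, 4.0)]
print(hits)               # the third request pushes the rate to 0.3 > 0.25
cache.sweep(now=20.0)     # no recent requests remain, so "a" is evicted
print("a" in cache.cached)
```

Each item is judged only against its own threshold, so no item is ever evicted to make room for another; that is the main structural difference from a capacity-bound LFU.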

4. Theoretical Properties

Optimality: In steady state and with exact $\lambda_{i}$, the per-item rule yields the minimum expected cost. Each item's decision independently selects the lower-cost strategy: recompute on demand, or always store.
Sliding Window Approximation: Using $f_{i}$ as the plug-in estimate (MLE) for $\lambda_{i}$ results in estimation error proportional to $1/\sqrt{W}$. As $W$ increases, empirical performance converges to the optimal (Scouarnec et al., 2013).
Full Decomposition: Absence of cross-item interactions allows the global problem to decompose into $N$ independent, one-dimensional subproblems.
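The effect of window length on the frequency estimate can be checked with a small simulation. For a Poisson process of rate $\lambda$, the count in a window of length $W$ has variance $\lambda W$, so the estimate count$/W$ has standard deviation $\sqrt{\lambda / W}$. The sketch below compares that prediction with simulated values; the rate, window lengths, trial count, and seed are arbitrary assumptions made for the illustration.

```python
import math
import random
import statistics


def poisson_count(lam, window, rng):
    """Number of arrivals in [0, window) for a Poisson process of rate lam."""
    t, n = rng.expovariate(lam), 0
    while t < window:
        n += 1
        t += rng.expovariate(lam)  # exponential inter-arrival times
    return n


def estimator_std(lam, window, trials, rng):
    """Empirical standard deviation of f = count / window over many windows."""
    samples = [poisson_count(lam, window, rng) / window for _ in range(trials)]
    return statistics.pstdev(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
lam = 1.0               # hypothetical request rate
results = {}
for window in (25, 100, 400):
    measured = estimator_std(lam, window, trials=1000, rng=rng)
    predicted = math.sqrt(lam / window)
    results[window] = (measured, predicted)
    print(f"W={window:4d}  measured={measured:.4f}  predicted={predicted:.4f}")
```

Quadrupling the window roughly halves the spread of the estimate, which is why items whose true rate sits close to the threshold need the longest windows to be classified reliably.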

5. Empirical Evaluation

The policy was evaluated using both synthetic and real-world traces:

Workloads: Synthetic Zipf-distributed items (e.g., 10,000 movies, 5,000 ads; various parameter settings) and traces from Netflix (17,000 items, 6 years), YouTube Sci (~252,000 items), and Daum Travel (~9,000 items).
Costs: Simulated with Amazon EC2 and S3 prices, with compute priced in USD per chunk and storage in USD per chunk-hour.
Comparison Policies: Evaluated against global TTL (one TTL shared across all items), a clairvoyant lower-bound (oracle), and LRU under fixed-size constraint.

Measured Metrics:

Total cost per chunk served,
Hit ratio (informative),
Amortized cost per request.

Principal Results:

Cost-aware LFU matches the clairvoyant lower bound closely.
Delivers 10–20% lower cost than global TTL and up to 30% lower than LRU across all request rates.
On real traces, individual TTL (per-item, cost-aware) saves 15% over global TTL and 25% over LRU (Scouarnec et al., 2013).

Policy	Cost Improvement vs. LRU	Cost Improvement vs. Global TTL
Cost-aware LFU	Up to 30%	10–20%
Individual TTL	25% (real traces)	15% (real traces)

6. Parameter Sensitivity and Practical Guidance

Impact of Price Ratios: As $S_{i}/C_{i}$ rises (storage becoming more expensive relative to compute), the critical frequency threshold increases (an item is cached only if $\lambda_{i} > S_{i}/C_{i}$), decreasing the cache population. As $S_{i}/C_{i}$ falls, more items are retained in cache. The cache size therefore adapts without an explicit global limit.
Sliding Window Size ($W$): $W$ must be at least on the order of $C_{i}/S_{i}$, the mean inter-request time at the threshold frequency, to resolve frequency estimates near the threshold. Empirical tests found that an intermediate $W$ performs best; a substantially larger $W$ increases adaptation lag under non-stationary access, while a smaller $W$ increases estimation variance and leads to suboptimal thresholding.
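The dependence of cache population on the price ratio can be illustrated with a synthetic popularity distribution. The sketch below assumes Zipf-distributed request rates and a single price ratio shared by all items; the catalogue size, total request rate, exponent, and ratios are hypothetical and not taken from the evaluation.

```python
def zipf_rates(n_items, total_rate, alpha=1.0):
    """Per-item request rates proportional to rank^(-alpha), summing to total_rate."""
    weights = [rank ** -alpha for rank in range(1, n_items + 1)]
    norm = sum(weights)
    return [total_rate * w / norm for w in weights]


def cached_count(rates, storage_cost, compute_cost):
    """Number of items whose rate exceeds the threshold S / C."""
    threshold = storage_cost / compute_cost
    return sum(1 for lam in rates if lam > threshold)


# Hypothetical catalogue: 1,000 items sharing 100 requests per time unit.
rates = zipf_rates(n_items=1000, total_rate=100.0)

# Sweep the storage-to-compute price ratio with compute cost fixed at 1.
ratios = [0.01, 0.05, 0.2, 1.0]
sizes = [cached_count(rates, storage_cost=r, compute_cost=1.0) for r in ratios]

for ratio, size in zip(ratios, sizes):
    print(f"S/C = {ratio:<5} -> {size} items cached")
```

The cached set shrinks monotonically as storage becomes relatively more expensive, so the price ratio plays the role that a capacity limit plays in a conventional cache.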

A plausible implication is that this decomposable, adaptive approach can be directly tuned for cost minimization under dynamic pricing models and variable demand, with empirical validation demonstrating robust performance gains over traditional cache replacement schemes.

7. Summary and Significance

The cost-aware LFU policy reformulates cache management for cloud contexts by forgoing the fixed-capacity constraint and integrating explicit cost models with online, per-item access frequency tracking. It admits fully online execution, converges to the per-item optimum as estimator window size grows, adapts naturally to popularity skew and burstiness, and outperforms both size-based (LRU/LFU) and global TTL approaches in terms of realized cost, while nearly matching theoretical lower bounds (Scouarnec et al., 2013). This model is especially salient for cloud systems where elastic resources, item heterogeneity, and fine-grained economics are primary operational considerations.

References

Scouarnec et al. (2013). Cache policies for cloud-based systems: To keep or not to keep.
