Cost-Aware LFU for Cloud Caching

Updated 6 April 2026
  • Cost-aware LFU is a caching policy that integrates access frequency with explicit storage and compute cost models for optimized cloud caching decisions.
  • The approach uses a per-item threshold (λ > S/C) to make independent, online caching decisions, ensuring provable optimality under steady-state conditions.
  • Empirical evaluations demonstrate 10–30% cost reductions over traditional LRU and TTL policies, validating its effectiveness in dynamic, pay-per-usage cloud environments.

A cost-aware least-frequently-used (LFU) policy is a caching approach for cloud-based systems that incorporates both access frequency and explicit cloud cost models—specifically storage and compute costs—rather than traditional fixed-capacity constraints. Unlike classical cache replacement algorithms that evict entries based purely on recency or historical frequency, the cost-aware LFU evaluates, for each individual item, whether retaining it in the cache minimizes overall operational cost under the cloud’s pay-per-usage paradigm. The policy is fully decomposable across items, provably optimal under stationary access patterns, and achieves near-optimal performance with practical, online frequency estimation.

1. Problem Setting and Cost Model

The central context is cloud-based caching where data can be either recomputed on-the-fly or served from cloud storage. For each item $i \in \{1,\ldots,N\}$, the following holds:

  • Request Arrival: Each item $i$ is requested according to a Poisson process at rate $\lambda_i$.
  • Compute Cost: Recomputing (a miss) costs $C_i$ per access.
  • Storage Cost: Storing item $i$ in the cloud cache costs $S_i$ per unit time.
  • Capacity Assumption: There is no fixed storage limit. Operators pay proportional to cache occupancy and duration, distinct from classical caches dominated by up-front capacity investment.
  • Transfer Cost: Assumed negligible or folded into $C_i$.

Empirical access frequency $f_i$ is measured as the count of requests in a sliding window $W$ divided by $W$; under steady-state Poisson arrivals, $f_i \to \lambda_i$ as $W \to \infty$ (Scouarnec et al., 2013).

2. Mathematical Formulation

Let $x_i \in \{0,1\}$ indicate whether item $i$ is always cached ($x_i = 1$) or never cached ($x_i = 0$). The optimization objective is the long-run average cost per time unit:

$$J(x) = \sum_{i=1}^{N} \bigl[ x_i S_i + (1 - x_i)\,\lambda_i C_i \bigr]$$

  • If cached ($x_i = 1$): Pay $S_i$ per unit time; all accesses are hits.
  • If not cached ($x_i = 0$): Pay recompute cost $C_i$ per access at rate $\lambda_i$.

The problem is separable across items; minimizing $J$ reduces to per-item decisions:

$$x_i^{*} = \begin{cases} 1 & \text{if } \lambda_i C_i > S_i, \\ 0 & \text{otherwise.} \end{cases}$$
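
As a minimal sketch of this per-item decision, the following Python compares the two long-run cost rates for each item and keeps the cheaper option. The `Item` class, the function names, and all numeric values are illustrative assumptions and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical container for one item's parameters."""
    rate: float          # lambda_i: requests per unit time
    compute_cost: float  # C_i: cost per recomputation (miss)
    storage_cost: float  # S_i: cost per unit time while cached


def should_cache(item: Item) -> bool:
    """Cache iff the recompute cost rate lambda_i * C_i exceeds the storage rate S_i."""
    return item.rate * item.compute_cost > item.storage_cost


def cost_rate(item: Item, cached: bool) -> float:
    """Long-run average cost per unit time for an always- or never-cached item."""
    return item.storage_cost if cached else item.rate * item.compute_cost


def total_cost_rate(items: list[Item]) -> float:
    """Total cost under the per-item rule; each item is decided independently."""
    return sum(cost_rate(it, should_cache(it)) for it in items)


# Illustrative values only.
items = [
    Item(rate=10.0, compute_cost=0.5, storage_cost=1.0),  # 5.0 > 1.0: cache
    Item(rate=0.1, compute_cost=0.5, storage_cost=1.0),   # 0.05 < 1.0: recompute
]
decisions = [should_cache(it) for it in items]
```

Because the objective is a sum of independent per-item terms, no sorting or global search is needed; the total is simply the sum of each item's cheaper option.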

3. Cost-Aware LFU Rule and Online Algorithm

Define a per-item utility score:

$$u_i = \lambda_i C_i - S_i$$

  • $u_i > 0$ implies it is cost-effective to cache item $i$; otherwise, evict it.

Equivalently, use the threshold:

  • If $\lambda_i > S_i / C_i$, set $x_i = 1$ (cache indefinitely).
  • If $\lambda_i \le S_i / C_i$, set $x_i = 0$ (never cache).

Online implementation uses a sliding window estimator of frequency:

$$f_i(t) = \frac{\#\{\text{requests for item } i \text{ in } (t - W,\, t]\}}{W}$$

This “cost-aware LFU” (Editor's term) policy compares observed frequency $f_i$ to $S_i / C_i$ for each item, diverging from classic LFU by making individual, threshold-based caching decisions absent any global capacity constraint (Scouarnec et al., 2013).
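
The online rule can be sketched as follows. This is one possible reading rather than the authors' implementation: the `CostAwareLFU` class, its method names, and the choice to re-evaluate an item on each of its own requests are assumptions made for illustration.

```python
from collections import deque


class CostAwareLFU:
    """Sketch of a per-item sliding-window policy (hypothetical API)."""

    def __init__(self, window: float):
        self.window = window   # W: sliding-window length
        self.requests = {}     # item key -> deque of request timestamps
        self.cached = set()    # keys currently held in cache

    def _frequency(self, key, now: float) -> float:
        """Requests seen in (now - W, now], divided by W."""
        times = self.requests.get(key)
        if times is None:
            return 0.0
        while times and times[0] <= now - self.window:
            times.popleft()    # drop requests that have left the window
        return len(times) / self.window

    def refresh(self, key, now: float, compute_cost: float, storage_cost: float) -> None:
        """Re-evaluate one item: keep it iff f_i > S_i / C_i."""
        if self._frequency(key, now) > storage_cost / compute_cost:
            self.cached.add(key)
        else:
            self.cached.discard(key)

    def on_request(self, key, now: float, compute_cost: float, storage_cost: float) -> bool:
        """Record a request; return True if it was served from cache (a hit)."""
        hit = key in self.cached
        self.requests.setdefault(key, deque()).append(now)
        self.refresh(key, now, compute_cost, storage_cost)
        return hit


# Illustrative run: threshold S/C = 0.25 requests per time unit, W = 10.
policy = CostAwareLFU(window=10.0)
hits = [policy.on_request("a", t, compute_cost=1.0, storage_cost=0.25)
        for t in (0.0, 1.0, 2.0, 3.0)]
```

In this sketch an item is only re-evaluated when it is requested or when `refresh` is called explicitly, so a deployment would also need a periodic sweep to evict items that have gone quiet.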

4. Theoretical Properties

  • Optimality: In steady-state and with exact $\lambda_i$, the per-item rule yields the minimum expected cost. Each item's decision independently selects the lower-cost strategy: recompute on demand, or always store.
  • Sliding Window Approximation: Using $f_i$ as the plug-in estimate (MLE) for $\lambda_i$ results in estimation error proportional to $1/\sqrt{W}$. As $W$ increases, empirical performance converges to the optimal (Scouarnec et al., 2013).
  • Full Decomposition: Absence of cross-item interactions allows the global problem to decompose into $N$ independent, one-dimensional subproblems.
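
The dependence on window size can be made explicit under the stated Poisson assumption. Writing $N_i(W)$ (notation introduced here for illustration) for the number of requests to item $i$ in a window of length $W$, the count is Poisson with mean $\lambda_i W$, so the windowed frequency $f_i = N_i(W)/W$ satisfies:

$$\mathbb{E}[f_i] = \frac{\mathbb{E}[N_i(W)]}{W} = \lambda_i, \qquad \operatorname{Var}[f_i] = \frac{\operatorname{Var}[N_i(W)]}{W^2} = \frac{\lambda_i W}{W^2} = \frac{\lambda_i}{W}.$$

The estimator is unbiased and its standard deviation $\sqrt{\lambda_i / W}$ shrinks as the window grows. Items whose rate lies far from the threshold are therefore classified correctly even with short windows, while items whose rate is close to $S_i / C_i$ need a longer window to be classified reliably.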

5. Empirical Evaluation

The policy was evaluated using both synthetic and real-world traces:

  • Workloads: Synthetic Zipf-distributed items (e.g., 10,000 movies, 5,000 ads; various Zipf parameters) and traces from Netflix (17,000 items, 6 years), YouTube Sci (~252,000 items), and Daum Travel (~9,000 items).
  • Costs: Simulated with Amazon EC2 and S3 prices, with compute cost $C$ in USD per chunk and storage cost $S$ in USD per chunk-hour.
  • Comparison Policies: Evaluated against global TTL (one TTL shared across all items), a clairvoyant lower-bound (oracle), and LRU under fixed-size constraint.

Measured Metrics:

  • Total cost per chunk served,
  • Hit ratio (informative),
  • Amortized cost per request.

Principal Results:

  • Cost-aware LFU matches the clairvoyant lower bound closely.
  • Delivers 10–20% lower cost than global TTL and up to 30% lower than LRU across all request rates.
  • On real traces, individual TTL (per-item, cost-aware) saves 15% over global TTL and 25% over LRU (Scouarnec et al., 2013).

| Policy | Cost Improvement vs. LRU | Cost Improvement vs. Global TTL |
|---|---|---|
| Cost-aware LFU | Up to 30% | 10–20% |
| Individual TTL | 25% (real traces) | 15% (real traces) |

6. Parameter Sensitivity and Practical Guidance

  • Impact of Price Ratios: As $S_i/C_i$ rises (storage becomes more expensive relative to compute), the critical frequency threshold increases (an item is cached only if $\lambda_i > S_i/C_i$), decreasing the cache population. As $S_i/C_i$ falls, more items are retained in cache. This allows adaptable cache sizing without explicit global limits.
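  • Sliding Window Size ($W$): $W$ must be large enough to resolve frequency estimates near the threshold $S_i/C_i$ (at that frequency the mean inter-request time is $C_i/S_i$, so a much shorter window would typically observe no requests). Empirical tests found an intermediate window to work best; a substantially larger $W$ increases adaptation lag under non-stationary access, while a smaller $W$ increases estimation variance and leads to suboptimal thresholding.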
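
The effect of the price ratio on cache population can be illustrated with a small sweep. The popularity profile (rate 100/k for the k-th item), the prices, and the `cached_count` helper are hypothetical choices for this sketch, not values from the evaluation.

```python
def cached_count(rates, compute_cost, storage_cost):
    """Number of items whose request rate exceeds the threshold S / C."""
    threshold = storage_cost / compute_cost
    return sum(1 for lam in rates if lam > threshold)


# Hypothetical Zipf-like popularity: item k has rate 100 / k, for k = 1..1000.
rates = [100.0 / k for k in range(1, 1001)]

# Sweep the storage price while holding the compute price fixed.
population = {s: cached_count(rates, compute_cost=1.0, storage_cost=s)
              for s in (0.5, 1.0, 2.0, 4.0)}
# population == {0.5: 199, 1.0: 99, 2.0: 49, 4.0: 24}
```

Under this profile, each doubling of the storage price roughly halves the number of cached items, which shows how the cache resizes itself from prices alone without an explicit capacity parameter.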

A plausible implication is that this decomposable, adaptive approach can be directly tuned for cost minimization under dynamic pricing models and variable demand, with empirical validation demonstrating robust performance gains over traditional cache replacement schemes.

7. Summary and Significance

The cost-aware LFU policy reformulates cache management for cloud contexts by forgoing the fixed-capacity constraint and integrating explicit cost models with online, per-item access frequency tracking. It admits fully online execution, converges to the per-item optimum as estimator window size grows, adapts naturally to popularity skew and burstiness, and outperforms both size-based (LRU/LFU) and global TTL approaches in terms of realized cost, while nearly matching theoretical lower bounds (Scouarnec et al., 2013). This model is especially salient for cloud systems where elastic resources, item heterogeneity, and fine-grained economics are primary operational considerations.

References (1)
