StickySampling: Streaming Frequency Estimation

Updated 16 November 2025

StickySampling is a streaming algorithm that approximates item frequency counts in high-speed data streams with provable one-sided additive error guarantees.
It employs a decreasing Bernoulli sampling probability combined with periodic counter decay to achieve logarithmic space complexity relative to the failure probability.
The algorithm is applied in security-critical contexts such as DRAM RowHammer mitigation, ensuring efficient detection of hammer rows without false positives.

StickySampling is a streaming algorithm designed to maintain approximate frequency counts for items in a high-speed data stream, providing one-sided additive error guarantees using space that is logarithmic in the failure probability. Originally formulated for data streams by Manku and Motwani, StickySampling achieves strong probabilistic security and performance trade-offs, making it particularly suitable for security-critical systems such as DRAM RowHammer mitigation, where it enables the detection of “hammer” rows with provable guarantees.

1. Problem Statement and Formal Definition

The StickySampling algorithm addresses the problem of tracking item (e.g., memory row) frequencies over a potentially unbounded data stream $S$ of length $N$ . For each unique element $a$ , let $\operatorname{real}(a)$ denote the true frequency and $\operatorname{est}(a)$ the estimate reported by the algorithm. The algorithm maintains a data structure $C$ with the following property:

$\operatorname{real}(a) - \epsilon N \leq \operatorname{est}(a) \leq \operatorname{real}(a),\quad \text{with probability at least } 1-\delta$

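where $\epsilon \in (0,1)$ is the permissible additive error fraction and $\delta \in (0,1)$ is the failure probability of the lower bound (the upper bound $\operatorname{est}(a) \leq \operatorname{real}(a)$ always holds, because a counter is only incremented when its item actually occurs). In practice, parameters are tuned so that $\epsilon N$ is a small fraction of a relevant system threshold (such as the RowHammer threshold, RH), and $\delta$ is the tolerable false-negative rate.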

Key parameters:

$\epsilon$ : additive error fraction (e.g., set so $\epsilon N \approx \mathrm{RH}/4$ in DRAM applications)
$\delta$ : failure probability
$t=\left\lceil \frac{1}{\epsilon}\ln \frac{1}{2\epsilon\delta} \right\rceil$ : support-width constant, controlling counter compression frequency
$\text{window\_width}=2t$ : initial number of updates before the first compression; after each compression the window doubles and the sampling probability halves
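
The parameter formulas listed above can be evaluated directly. The following is a minimal sketch; the function name `sticky_params` and the input checks are illustrative assumptions, not part of the original formulation.

```python
import math
from typing import Tuple


def sticky_params(epsilon: float, delta: float) -> Tuple[int, int]:
    """Return (t, initial window_width) from the formulas in the text.

    sticky_params is a hypothetical helper name used for illustration.
    """
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie in (0, 1)")
    if 2.0 * epsilon * delta >= 1.0:
        # Assumption: the formula is only meaningful when the log is positive.
        raise ValueError("2 * epsilon * delta must be below 1")
    # t = ceil((1/epsilon) * ln(1 / (2 * epsilon * delta)))
    t = math.ceil((1.0 / epsilon) * math.log(1.0 / (2.0 * epsilon * delta)))
    # The first window spans 2t updates.
    return t, 2 * t
```

Tightening $\epsilon$ grows $t$ roughly linearly in $1/\epsilon$, whereas tightening $\delta$ adds only a logarithmic term.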

StickySampling combines a geometrically decreasing Bernoulli sampling probability with periodic counter decay (“Compress”) to bound the number of stored counters.

2. Algorithm: Annotated Pseudocode

A hardware-friendly version of the StickySampling algorithm maintains accuracy and memory efficiency:

procedure StickySampling(ε, δ)
  processed ← 0
  t ← ceil((1/ε) * ln(1/(2εδ)))
  window_width ← 2 * t
  P_sample ← 1.0      // initial sampling probability
  C ← empty map<row_address → count>

  for each activation S[i] do
    processed ← processed + 1

    // UPDATE: Frequency Count Maintenance
    if S[i] in C then
      C[S[i]] ← C[S[i]] + 1
    else
      r ← uniform_random(0,1)
      if r ≤ P_sample then
        C[S[i]] ← 1
      end if
    end if

    // COMPRESS: Counter Decay and Window Update
    if processed = window_width then
      for each x in C do
        // Draw a geometric decrement: count tails until the first heads
        tails ← 0
        while coin_flip() = TAILS do
          tails ← tails + 1
        end while
        C[x] ← C[x] - tails
        if C[x] ≤ 0 then
          remove x from C
        end if
      end for

      window_width ← 2 * window_width
      P_sample ← P_sample / 2
      processed ← 0
    end if
  end for

  return C // at any time, C[x] is est(x)

This structure admits new items with a decreasing probability, ensuring rare items are dropped over time. The Compress step uses geometric decay to cap state and avoids linear growth over the stream.
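
The update and Compress steps described above can be sketched in software as follows. This is a minimal illustration that mirrors the pseudocode, not a hardware design; the class name `StickySampler`, its method names, and the optional `rng` argument are assumptions introduced here.

```python
import math
import random


class StickySampler:
    """Software sketch of the pseudocode above (names are illustrative)."""

    def __init__(self, epsilon, delta, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.t = math.ceil((1.0 / epsilon) * math.log(1.0 / (2.0 * epsilon * delta)))
        self.window_width = 2 * self.t
        self.p_sample = 1.0   # initial sampling probability
        self.processed = 0    # updates seen in the current window
        self.counts = {}      # item -> partial count

    def update(self, item):
        self.processed += 1
        if item in self.counts:
            # Tracked items are always counted.
            self.counts[item] += 1
        elif self.rng.random() < self.p_sample:
            # Untracked items are admitted with probability p_sample.
            self.counts[item] = 1
        if self.processed == self.window_width:
            self._compress()

    def _compress(self):
        for key in list(self.counts):
            # Geometric decrement: number of tails before the first heads.
            tails = 0
            while self.rng.random() < 0.5:
                tails += 1
            remaining = self.counts[key] - tails
            if remaining > 0:
                self.counts[key] = remaining
            else:
                del self.counts[key]
        self.window_width *= 2
        self.p_sample /= 2.0
        self.processed = 0

    def estimate(self, item):
        # Untracked items report an estimate of zero.
        return self.counts.get(item, 0)
```

Because a counter is only incremented when its item occurs and is only ever decremented by Compress, `estimate(item)` can never exceed the true count, whatever the random draws are.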

3. Accuracy and Space Complexity Guarantees

Let $N$ be the total number of processed items. Under the specified parameters:

The counter table maintains at most $M = \left\lceil \frac{1}{\epsilon} \ln \frac{1}{2\epsilon\delta} \right\rceil$ entries.
For any row address $a$ :
- Deterministic upper bound: $\operatorname{est}(a) \leq \operatorname{real}(a)$ , so a count is never overestimated.
- Probabilistic lower bound:
$\Pr[\operatorname{real}(a) - \epsilon N \leq \operatorname{est}(a) \leq \operatorname{real}(a)] \geq 1 - \delta.$

Space usage is thus

$|C| = O\left(\frac{1}{\epsilon}\ln\frac{1}{\epsilon\delta}\right)$

with each entry recording a row address and its partial count.

4. Security Guarantees for RowHammer Mitigation

For DRAM RowHammer detection, “critical” rows (potential aggressors) are defined as $\operatorname{real}(a) > \mathrm{RH}$ within a refresh window. To guarantee detection,

Set $\epsilon$ so that $\epsilon N < \mathrm{RH}$ .
Any row with $\operatorname{real}(a) \geq \mathrm{RH} + \epsilon N$ will have $\operatorname{est}(a) \geq \mathrm{RH}$ with probability at least $1-\delta$ , triggering mitigation.
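Because $\operatorname{est}(a) \leq \operatorname{real}(a)$ always holds, no row with $\operatorname{real}(a) < \mathrm{RH}$ can reach $\operatorname{est}(a) \geq \mathrm{RH}$ : there are no false positives.

With this, rows whose activation count exceeds the hammer threshold by at least $\epsilon N$ are detected and mitigated with high confidence before causing victim bitflips.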
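
The detection rule in this section amounts to a threshold test on the stored estimates. The sketch below assumes the counter table is available as a plain mapping from row address to estimate; the function names `rows_to_mitigate` and `guaranteed_detection_count` are hypothetical.

```python
def rows_to_mitigate(counters, rh_threshold):
    """Return rows whose estimate has reached the threshold, in sorted order.

    Since est(a) <= real(a), every returned row truly has real(a) >= RH.
    """
    return sorted(row for row, est in counters.items() if est >= rh_threshold)


def guaranteed_detection_count(rh_threshold, epsilon, n):
    """Smallest true count for which detection holds with probability >= 1 - delta.

    This is RH + epsilon * N, taken from the guarantee stated in the text.
    """
    return rh_threshold + epsilon * n
```

Rows whose true count lies between $\mathrm{RH}$ and $\mathrm{RH} + \epsilon N$ may or may not be reported, which is why $\epsilon N$ is kept small relative to $\mathrm{RH}$ .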

5. Comparison with Reservoir Sampling and Lossy Counting

Algorithm	Space Complexity	Error Profile
Reservoir Sampling	$O(k)$ ( $k \sim N/\mathrm{RH}$ )	Probabilistic (detection by sampling)
Lossy Counting	$O\left(\frac{1}{\epsilon}\log(\epsilon N)\right)$	One-sided, deterministic lower bound
StickySampling	$O\left(\frac{1}{\epsilon}\log\frac{1}{\epsilon\delta}\right)$	One-sided additive $\epsilon N$ error with failure $\leq \delta$

Reservoir Sampling provides uniform sampling but does not yield frequency counts, and is relatively inefficient for high security as $k$ scales steeply. Lossy Counting provides one-sided error but its counter state grows with $\log(\epsilon N)$ . StickySampling achieves similar error guarantees to Lossy Counting, with superior scaling—its state does not depend on the total stream length $N$ , only on $\epsilon$ and $\delta$ .
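
The scaling difference between the two one-sided schemes can be illustrated by evaluating the asymptotic expressions from the table. This is a rough sketch only: it assumes natural logarithms and drops the constant factors hidden by the $O(\cdot)$ notation, and the function names are hypothetical.

```python
import math


def lossy_counting_bound(epsilon, n):
    """(1/epsilon) * ln(epsilon * n): grows with the stream length n."""
    return (1.0 / epsilon) * math.log(epsilon * n)


def sticky_sampling_bound(epsilon, delta):
    """(1/epsilon) * ln(1 / (epsilon * delta)): independent of n."""
    return (1.0 / epsilon) * math.log(1.0 / (epsilon * delta))
```

Since constants are ignored, the absolute values for a particular $N$ should not be compared too literally; the point is that the first expression keeps growing with $N$ while the second stays fixed once $\epsilon$ and $\delta$ are chosen.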

6. Parameter Selection and Practical Deployment in DRAM Controllers

In practical DRAM systems with $\text{tREFW} = 32$ ms, $\text{tRC} = 48$ ns, and worst-case $N_{\max} \sim 666,000$ activations per window, set $\mathrm{RH} = 4,000$ (RowHammer threshold). To ensure $\epsilon N_{\max} = \mathrm{RH}/4$ , select $\epsilon = 1.5 \times 10^{-3}$ . With $\delta = 10^{-3}$ , this yields:

$t = \left\lceil (1/\epsilon) \ln(1/(2\epsilon\delta)) \right\rceil \approx 8,478$
$\text{window\_width} = 2t \approx 17,000$ activations

The resulting counter table holds $\leq 8,478$ entries. The first Compress step is triggered after about $17,000$ activations, halving $P_\text{sample}$ and doubling the next window; each later window is twice as long as the one before it. Such resource demands are moderate relative to DRAM controller capabilities and allow designer-controlled trade-offs by tuning $\epsilon$ and $\delta$ ; lowering $\epsilon$ increases tracking fidelity but raises memory usage, while decreasing $\delta$ reduces false negatives with only logarithmic space cost.
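
The window-doubling schedule determines how often Compress runs within one refresh window. The sketch below counts the Compress steps and the resulting sampling probability, assuming the tracker starts fresh at the beginning of the window; the function name `compression_schedule` is hypothetical.

```python
def compression_schedule(t, n_max):
    """Return (number of Compress steps, final P_sample) after n_max activations.

    Windows have widths 2t, 4t, 8t, ... as in the pseudocode.
    """
    window = 2 * t
    consumed = 0
    compressions = 0
    p_sample = 1.0
    # A Compress step fires each time a full window has been processed.
    while consumed + window <= n_max:
        consumed += window
        window *= 2
        p_sample /= 2.0
        compressions += 1
    return compressions, p_sample
```

For $t$ in the neighborhood of $8,500$ and roughly $666,000$ activations, this loop yields five Compress steps, so the sampling probability ends the window at $1/32$ .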

7. Significance and Applicability

StickySampling introduces a novel combination of provable one-sided additive error ( $\epsilon N$ ) and logarithmic-in- $1/\delta$ space, enabled by the geometric decay and window-doubling mechanism. It is the first streaming method to provide these guarantees within the domain of architectural RowHammer defenses, ensuring the detection of all rows surpassing the hammer threshold with tunable confidence while avoiding false positives. The algorithm’s balanced security-performance trade-off surpasses both pure sampling and deterministic bucket-based schemes for this class of memory security problems. Practitioners should select $\epsilon = (\mathrm{RH}/4)/N_{\max}$ and $\delta$ to match system-level false-negative requirements, thereby right-sizing counter table, update window, and sampling probability to ensure resilient mitigation against aggressive RowHammer attacks.
