StickySampling: Streaming Frequency Estimation
- StickySampling is a streaming algorithm that approximates item frequency counts in high-speed data streams with provable one-sided additive error guarantees.
- It employs a decreasing Bernoulli sampling probability combined with periodic counter decay to achieve logarithmic space complexity relative to the failure probability.
- The algorithm is applied in security-critical contexts such as DRAM RowHammer mitigation, ensuring efficient detection of hammer rows without false positives.
StickySampling is a streaming algorithm designed to maintain approximate frequency counts for items in a high-speed data stream, providing one-sided additive error guarantees using space that is logarithmic in the failure probability. Originally formulated for data streams by Manku and Motwani, StickySampling achieves strong probabilistic security and performance trade-offs, making it particularly suitable for security-critical systems such as DRAM RowHammer mitigation, where it enables the detection of “hammer” rows with provable guarantees.
1. Problem Statement and Formal Definition
The StickySampling algorithm addresses the problem of tracking item (e.g., memory row) frequencies over a potentially unbounded data stream of length $N$. For each unique element $x$, let $f(x)$ denote the true frequency and $\mathrm{est}(x)$ the estimate reported by the algorithm. The algorithm maintains a data structure with the following property:
$$\mathrm{est}(x) \le f(x) \quad \text{always, and} \quad \Pr\big[f(x) - \mathrm{est}(x) > \varepsilon N\big] \le \delta,$$
where $\varepsilon$ is the permissible additive error fraction and $\delta$ is the failure probability on the upper bound. In practice, parameters are tuned so that $\varepsilon N$ is a small fraction of a relevant system threshold (such as the RowHammer threshold, RH), and $\delta$ is the tolerable false-negative rate.
Key parameters:
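- $\varepsilon$: additive error fraction (e.g., set so that $\varepsilon N$ is a small fraction of the RowHammer threshold in DRAM applications)
- $\delta$: failure probability
- $t = \lceil \frac{1}{\varepsilon}\ln\frac{1}{2\varepsilon\delta} \rceil$: support-width constant, controlling counter compression frequency
- window width (initially $2t$): updates before each compression and halving of the sampling probability; it doubles after each compression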
StickySampling combines a geometrically decreasing Bernoulli sampling probability with periodic counter decay (“Compress”) to bound the number of stored counters.
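As a rough illustration of how the derived parameters follow from the two tunable ones, the sketch below computes $t$ and the initial window width using the formula that appears in the pseudocode of Section 2. The function name `sticky_params` and the example values of `eps` and `delta` are assumptions made for illustration and do not come from the source.

```python
import math


def sticky_params(eps: float, delta: float) -> tuple[int, int]:
    """Return (t, initial_window) for StickySampling.

    Uses t = ceil((1/eps) * ln(1/(2*eps*delta))) and an initial
    window of 2*t, following the pseudocode in Section 2.
    """
    if eps <= 0 or delta <= 0 or 2.0 * eps * delta >= 1.0:
        raise ValueError("need eps > 0, delta > 0 and 2*eps*delta < 1")
    t = math.ceil((1.0 / eps) * math.log(1.0 / (2.0 * eps * delta)))
    return t, 2 * t


# Hypothetical example values, chosen only to show the scale of t.
t, window = sticky_params(eps=0.001, delta=1e-6)
print(f"t = {t}, initial window = {window}")
```

Because $\delta$ only enters through the logarithm, tightening it by several orders of magnitude changes $t$ far less than tightening $\varepsilon$ by the same factor.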
2. Algorithm: Annotated Pseudocode
A hardware-friendly version of the StickySampling algorithm maintains accuracy and memory efficiency:
    procedure StickySampling(ε, δ)
        processed ← 0
        t ← ceil((1/ε) * ln(1/(2εδ)))
        window_width ← 2 * t
        P_sample ← 1.0                      // initial sampling probability
        C ← empty map<row_address → count>
        for each activation S[i] do
            processed ← processed + 1
            // UPDATE: frequency count maintenance
            if S[i] in C then
                C[S[i]] ← C[S[i]] + 1
            else
                r ← uniform_random(0, 1)
                if r ≤ P_sample then
                    C[S[i]] ← 1
                end if
            end if
            // COMPRESS: counter decay and window update
            if processed = window_width then
                for each x in C do
                    // flip a fair coin until the first heads; subtract one per tails
                    num_tails ← 0
                    while coin_flip() = TAILS do
                        num_tails ← num_tails + 1
                    end while
                    C[x] ← C[x] - num_tails
                    if C[x] ≤ 0 then
                        remove x from C
                    end if
                end for
                window_width ← 2 * window_width
                P_sample ← P_sample / 2
                processed ← 0
            end if
        end for
        return C    // at any time, C[x] is est(x)
3. Accuracy and Space Complexity Guarantees
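Let $N$ be the total number of processed items. Under the specified parameters:
- The counter table maintains $O(t)$ entries, where $t = \lceil \frac{1}{\varepsilon}\ln\frac{1}{2\varepsilon\delta} \rceil$.
- For any row address $x$:
  - Deterministic lower bound: $\mathrm{est}(x) \le f(x)$.
  - Probabilistic upper bound: $\Pr[f(x) - \mathrm{est}(x) > \varepsilon N] \le \delta$.
Space usage is thus
$$O\!\left(\frac{1}{\varepsilon}\ln\frac{1}{\varepsilon\delta}\right),$$
with each entry recording a row address and its partial count.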
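The asymptotic form of the space bound can be read off from the definition of $t$ in the pseudocode of Section 2. The short derivation below spells out that step, under the assumption that the table size is proportional to $t$.

```latex
\begin{align*}
t &= \left\lceil \frac{1}{\varepsilon}\ln\frac{1}{2\varepsilon\delta} \right\rceil
   \le \frac{1}{\varepsilon}\left(\ln\frac{1}{\varepsilon} + \ln\frac{1}{\delta} - \ln 2\right) + 1 \\
  &\le \frac{1}{\varepsilon}\ln\frac{1}{\varepsilon\delta} + 1
   = O\!\left(\frac{1}{\varepsilon}\ln\frac{1}{\varepsilon\delta}\right).
\end{align*}
```

For fixed $\varepsilon$, halving $\delta$ therefore adds only about $(\ln 2)/\varepsilon$ to $t$, which is the sense in which the space is logarithmic in the failure probability.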
4. Security Guarantees for RowHammer Mitigation
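For DRAM RowHammer detection, “critical” rows (potential aggressors) are defined as those with $f(x) \ge T_{RH}$ activations within a refresh window, where $T_{RH}$ denotes the RowHammer threshold. To guarantee detection:
- Set $\varepsilon$ so that $\varepsilon N$ is a small fraction of $T_{RH}$.
- Any row with $f(x) \ge T_{RH}$ will have $\mathrm{est}(x) \ge T_{RH} - \varepsilon N$ with probability at least $1 - \delta$, triggering mitigation.
- No row with $f(x) < T_{RH} - \varepsilon N$ will be falsely reported, because $\mathrm{est}(x) \le f(x)$ always holds: there are no false positives.
With this, all rows exceeding the hammer threshold are detected and mitigated with high confidence before causing victim bitflips.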
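One way to read the detection rule is as a simple threshold test on the reported estimates. The sketch below assumes that mitigation is triggered when an estimate reaches the RowHammer threshold minus the additive error budget $\varepsilon N$; the function name `rows_to_mitigate`, the argument names, and all numeric values are hypothetical and serve only to illustrate the comparison.

```python
def rows_to_mitigate(estimates: dict[int, int], t_rh: int, eps: float, n: int) -> list[int]:
    """Return the rows whose estimate reaches t_rh - eps * n.

    estimates: row address -> est(row), e.g. the counter table C
    t_rh:      RowHammer threshold (activations per refresh window)
    eps, n:    error fraction and number of processed activations
    """
    trigger = t_rh - eps * n
    return sorted(row for row, est in estimates.items() if est >= trigger)


# Hypothetical counter table and parameters (not from the source).
table = {0x1A: 4950, 0x2B: 1200, 0x3C: 4500}
print(rows_to_mitigate(table, t_rh=5000, eps=0.0001, n=1_000_000))
```

Since estimates never exceed true counts, a row can only pass this test if its true activation count is at least the trigger value.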
5. Comparison with Reservoir Sampling and Lossy Counting
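| Algorithm | Space Complexity | Error Profile |
|---|---|---|
| Reservoir Sampling | $O(k)$ for a reservoir of $k$ samples | Probabilistic (detection by sampling) |
| Lossy Counting | $O\!\left(\frac{1}{\varepsilon}\log(\varepsilon N)\right)$ | One-sided, deterministic lower bound |
| StickySampling | $O\!\left(\frac{1}{\varepsilon}\ln\frac{1}{\varepsilon\delta}\right)$ | One-sided additive error $\varepsilon N$ with failure probability $\delta$ |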
Reservoir Sampling provides uniform sampling but does not yield frequency counts, and is relatively inefficient for high security as its required sample size scales steeply. Lossy Counting provides one-sided error but its counter state grows with $N$. StickySampling achieves similar error guarantees to Lossy Counting with superior scaling: its state does not depend on the total stream length $N$, only on $\varepsilon$ and $\delta$.
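The difference in scaling can be made concrete by evaluating the two bounds for growing stream lengths. This is a sketch under stated assumptions: the Lossy Counting figure uses the standard worst-case bound of $(1/\varepsilon)\ln(\varepsilon N)$ entries, the StickySampling figure uses $t$ from the pseudocode in Section 2, constant factors are ignored, and the function names and parameter values are hypothetical.

```python
import math


def lossy_counting_entries(eps: float, n: int) -> float:
    """Worst-case Lossy Counting table size, (1/eps) * ln(eps * n)."""
    return (1.0 / eps) * math.log(eps * n)


def sticky_sampling_entries(eps: float, delta: float) -> int:
    """Table-size scale t for StickySampling; note that n does not appear."""
    return math.ceil((1.0 / eps) * math.log(1.0 / (2.0 * eps * delta)))


eps, delta = 1e-3, 1e-6  # hypothetical settings
for n in (10**6, 10**9, 10**12):
    print(n, round(lossy_counting_entries(eps, n)), sticky_sampling_entries(eps, delta))
```

The Lossy Counting column keeps growing with the stream length while the StickySampling column stays fixed, although for short streams the Lossy Counting bound can be the smaller of the two.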
6. Parameter Selection and Practical Deployment in DRAM Controllers
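In practical DRAM systems, the refresh window (milliseconds), the minimum interval between activations (nanoseconds), and the resulting worst-case number of activations per window $N$ are fixed by the device, and the RowHammer threshold sets the detection target. To ensure that $\varepsilon N$ is a small fraction of that threshold, select $\varepsilon$ accordingly. With $\delta$ chosen as the tolerable false-negative rate, this yields:
- an initial window of about 17,000 activations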
The resulting counter table holds on the order of $t$ entries. After the first 17,000 activations, the Compress step is triggered, halving the sampling probability and doubling the next window. Such resource demands are moderate relative to DRAM controller capabilities and allow designer-controlled trade-offs by tuning $\varepsilon$ and $\delta$: lowering $\varepsilon$ increases tracking fidelity but raises memory usage, while decreasing $\delta$ reduces false negatives with only logarithmic space cost.
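To see how often Compress actually runs within one refresh window, the window-doubling rule can be unrolled. The sketch below takes the 17,000-activation first window from the text and assumes a hypothetical total of 1,000,000 activations per refresh window; the function name `compress_schedule` is likewise an assumption.

```python
def compress_schedule(initial_window: int, total_activations: int) -> list[tuple[int, float]]:
    """List (activation index of each Compress, sampling probability after it).

    Each Compress doubles the next window and halves the sampling
    probability, as in the pseudocode of Section 2.
    """
    schedule = []
    window, p_sample, position = initial_window, 1.0, 0
    while position + window <= total_activations:
        position += window
        p_sample /= 2
        window *= 2
        schedule.append((position, p_sample))
    return schedule


# 17,000 comes from the text; 1,000,000 is a hypothetical stream length.
for position, p in compress_schedule(17_000, 1_000_000):
    print(f"Compress after activation {position}: P_sample = {p}")
```

Because the windows grow geometrically, the number of Compress passes grows only logarithmically with the number of activations, which keeps the decay work small relative to the update path.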
7. Significance and Applicability
StickySampling introduces a novel combination of provable one-sided additive error ($\varepsilon N$) and logarithmic-in-$1/\delta$ space, enabled by the geometric decay and window-doubling mechanism. It is the first streaming method to provide these guarantees within the domain of architectural RowHammer defenses, ensuring the detection of all rows surpassing the hammer threshold with tunable confidence while avoiding false positives. The algorithm’s balanced security-performance trade-off surpasses both pure sampling and deterministic bucket-based schemes for this class of memory security problems. Practitioners should select $\varepsilon$ and $\delta$ to match system-level false-negative requirements, thereby right-sizing the counter table, update window, and sampling probability to ensure resilient mitigation against aggressive RowHammer attacks.