Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
153 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

HyperLogLog (HLL) Security: Inflating Cardinality Estimates (2011.10355v1)

Published 20 Nov 2020 in cs.CR and cs.DS

Abstract: Counting the number of distinct elements on a set is needed in many applications, for example to track the number of unique users in Internet services or the number of distinct flows on a network. In many cases, an estimate rather than the exact value is sufficient and thus many algorithms for cardinality estimation that significantly reduce the memory and computation requirements have been proposed. Among them, Hyperloglog has been widely adopted in both software and hardware implementations. The security of Hyperloglog has been recently studied showing that an attacker can create a set of elements that produces a cardinality estimate that is much smaller than the real cardinality of the set. This set can be used for example to evade detection systems that use Hyperloglog. In this paper, the security of Hyperloglog is considered from the opposite angle: the attacker wants to create a small set that when inserted on the Hyperloglog produces a large cardinality estimate. This set can be used to trigger false alarms in detection systems that use Hyperloglog but more interestingly, it can be potentially used to inflate the visits to websites or the number of hits of online advertisements. Our analysis shows that an attacker can create a set with a number of elements equal to the number of registers used in the Hyperloglog implementation that produces any arbitrary cardinality estimate. This has been validated in two commercial implementations of Hyperloglog: Presto and Redis. Based on those results, we also consider the protection of Hyperloglog against such an attack.

Citations (2)

Summary

We haven't generated a summary for this paper yet.