2000 character limit reached
A Formal Analysis of the Count-Min Sketch with Conservative Updates (2203.14549v1)
Published 28 Mar 2022 in cs.DM, cs.DS, and cs.PF
Abstract: Count-Min Sketch with Conservative Updates (CMS-CU) is a popular algorithm to approximately count items' appearances in a data stream. Despite CMS-CU's widespread adoption, the theoretical analysis of its performance is still wanting because of its inherent difficulty. In this paper, we propose a novel approach to study CMS-CU and derive new upper bounds on the expected value and the CCDF of the estimation error under an i.i.d. request process. Our formulas can be successfully employed to derive improved estimates for the precision of heavy-hitter detection methods and improved configuration rules for CMS-CU. The bounds are evaluated both on synthetic and real traces.