Count-Min Tree Sketch: Approximate counting for NLP (1604.05492v3)
Abstract: The Count-Min Sketch (CMS) is a widely adopted structure for approximate event counting in large-scale processing. In a previous work, we improved the original Count-Min Sketch with conservative update by using approximate (log) counters instead of linear counters. These structures are computationally efficient and improve the average relative error (ARE) of a CMS at a constant memory footprint. These improvements are well suited to NLP tasks, in which low-frequency items are of particular interest. However, while log counters improve the ARE, they introduce a residual error due to the approximation. In this paper, we propose the Count-Min Tree Sketch, a variant with pyramidal counters designed to take advantage of the Zipfian distribution of text data.
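For context, the sketch below illustrates the baseline the abstract builds on: a plain Count-Min Sketch with conservative update and linear counters. It is a minimal illustrative implementation, not the paper's code; the class name, parameters, and hashing scheme are assumptions chosen for brevity (log or pyramidal counters are not shown).

```python
import random


class CountMinSketch:
    """Minimal Count-Min Sketch with conservative update (illustrative only)."""

    def __init__(self, width, depth, seed=42):
        self.width = width
        self.depth = depth
        # One row of `width` counters per hash function.
        self.tables = [[0] * width for _ in range(depth)]
        rng = random.Random(seed)
        # One independent hash seed per row (assumption: Python's built-in
        # hash over (seed, item) tuples is good enough for illustration).
        self.seeds = [rng.getrandbits(64) for _ in range(depth)]

    def _buckets(self, item):
        # One bucket index per row.
        return [hash((seed, item)) % self.width for seed in self.seeds]

    def update(self, item, count=1):
        # Conservative update: only raise the counters that would otherwise
        # underestimate the item, which reduces overestimation error.
        buckets = self._buckets(item)
        current = min(self.tables[d][b] for d, b in enumerate(buckets))
        target = current + count
        for d, b in enumerate(buckets):
            if self.tables[d][b] < target:
                self.tables[d][b] = target

    def query(self, item):
        # Standard CMS point estimate: minimum over all rows.
        return min(self.tables[d][b] for d, b in enumerate(self._buckets(item)))


# Usage example: approximate word counts over a small (Zipf-like) text stream.
if __name__ == "__main__":
    cms = CountMinSketch(width=1000, depth=4)
    for word in "the cat sat on the mat the cat".split():
        cms.update(word)
    print(cms.query("the"), cms.query("cat"), cms.query("mat"))
```

The paper's contribution replaces the linear counters in such a structure with approximate counters (previous work) and with pyramidal counters (the Count-Min Tree Sketch), keeping the memory footprint constant while targeting the low-frequency items that dominate Zipfian text data.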