
Technical Report: Modeling Average False Positive Rates of Recycling Bloom Filters (2401.02647v2)

Published 5 Jan 2024 in cs.DS

Abstract: Bloom Filters are a space-efficient data structure used for testing membership in a set that errs only in the False Positive direction. However, the standard analysis that measures this False Positive rate provides a form of worst-case bound that is both overly conservative for the majority of network applications that utilize Bloom Filters, and reduces accuracy by not taking into account the actual state (number of bits set) of the Bloom Filter after each arrival. In this paper, we more accurately characterize the False Positive dynamics of Bloom Filters as they are commonly used in networking applications. In particular, network applications often utilize a Bloom Filter that "recycles": it repeatedly fills, and upon reaching a certain level of saturation, empties and fills again. In this context, it makes more sense to evaluate performance using the average False Positive rate instead of the worst-case bound. We show how to efficiently compute the average False Positive rate of recycling Bloom Filter variants via renewal and Markov models. We apply our models to both the standard Bloom Filter and a "two-phase" variant, verify the accuracy of our model with simulations, and find that the previous analysis' worst-case formulation leads to up to a 30% reduction in the efficiency of Bloom Filters when applied in network applications, while two-phase overhead diminishes as the needed False Positive rate is tightened.
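
To make the gap between the worst-case and the average False Positive rate concrete, here is a minimal Python sketch of a recycling Bloom Filter. It is a Monte Carlo illustration under stated assumptions, not the renewal or Markov model from the paper: every arrival is assumed to be a distinct item, each of its k hash positions is assumed to be uniform and independent, and the filter is emptied as soon as at least sigma bits are set. The names `simulate_recycling_bloom`, `m`, `k`, and `sigma` are hypothetical and chosen only for this example.

```python
import random


def simulate_recycling_bloom(m, k, sigma, arrivals, seed=0):
    """Simulate a recycling Bloom Filter (illustrative assumptions only).

    m        -- number of bits in the filter
    k        -- number of hash positions per item (assumed uniform, independent)
    sigma    -- recycle (empty the filter) once at least this many bits are set
    arrivals -- number of distinct items fed to the filter

    Returns (average_fp, worst_fp), both measured just before each insertion.
    """
    rng = random.Random(seed)
    bits = [False] * m
    set_count = 0
    fp_sum = 0.0
    worst_fp = 0.0

    for _ in range(arrivals):
        # Chance that a never-seen item maps to k bits that are already set.
        fp = (set_count / m) ** k
        fp_sum += fp
        worst_fp = max(worst_fp, fp)

        # Insert the item: k positions drawn uniformly with replacement.
        for _ in range(k):
            pos = rng.randrange(m)
            if not bits[pos]:
                bits[pos] = True
                set_count += 1

        # Recycle once the saturation threshold is reached.
        if set_count >= sigma:
            bits = [False] * m
            set_count = 0

    return fp_sum / arrivals, worst_fp


if __name__ == "__main__":
    avg, worst = simulate_recycling_bloom(m=1024, k=4, sigma=512, arrivals=20000)
    print(f"average FP rate: {avg:.5f}, worst observed FP rate: {worst:.5f}")
```

Because the filter spends much of each cycle well below its saturation threshold, the average reported by this sketch sits below the worst observed value, which is the qualitative effect the abstract describes.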

Authors (3)
  1. Kahlil Dozier
  2. Loqman Salamatian
  3. Dan Rubenstein
