Better space-time-robustness trade-offs for set reconciliation (2404.09607v1)
Abstract: We consider the problem of reconstructing the symmetric difference between similar sets from their representations (sketches) of size linear in the number of differences. Exact solutions to this problem are based on error-correcting coding techniques and suffer from long decoding times. Existing probabilistic solutions based on Invertible Bloom Lookup Tables (IBLTs) are time-efficient but offer insufficient success guarantees for many applications. Here we propose a tunable trade-off between the two approaches, combining the efficiency of IBLTs with an exponentially decreasing failure probability. The proof relies on a refined analysis of IBLTs proposed in (Baek Tejs Houen et al., SOSA 2023), which is of independent interest. We also propose a modification of our algorithm that makes it possible to tell which of the two sets each element of the symmetric difference belongs to.
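To make the IBLT-based approach mentioned in the abstract concrete, here is a minimal Python sketch of a textbook IBLT used for two-party set reconciliation. It is not the algorithm proposed in the paper: the names `IBLT` and `reconcile`, the choice of three hash functions, the SHA-256-based hashing, and the restriction to integer keys are all assumptions made for illustration. Each party sketches its set, the sketches are subtracted cell by cell, and the difference is recovered by repeatedly peeling cells that hold exactly one element.

```python
import hashlib

K = 3  # number of hash functions (an assumed, commonly used value)


def _hash(x, salt):
    """Deterministic 64-bit hash of integer x under a salt (illustrative choice)."""
    data = f"{salt}:{x}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class IBLT:
    """Hypothetical minimal IBLT over integer keys, with K partitioned subtables."""

    def __init__(self, cells_per_hash):
        self.w = cells_per_hash
        n = K * cells_per_hash
        self.count = [0] * n    # signed number of keys in each cell
        self.key_xor = [0] * n  # XOR of the keys in each cell
        self.chk_xor = [0] * n  # XOR of the key checksums in each cell

    def _cells(self, x):
        # One cell per subtable, so the K cells of a key are always distinct.
        return [i * self.w + _hash(x, i) % self.w for i in range(K)]

    def _update(self, x, sign):
        for c in self._cells(x):
            self.count[c] += sign
            self.key_xor[c] ^= x
            self.chk_xor[c] ^= _hash(x, "chk")

    def insert(self, x):
        self._update(x, +1)

    def subtract(self, other):
        """Cell-wise difference; keys present in both sketches cancel out."""
        out = IBLT(self.w)
        for c in range(K * self.w):
            out.count[c] = self.count[c] - other.count[c]
            out.key_xor[c] = self.key_xor[c] ^ other.key_xor[c]
            out.chk_xor[c] = self.chk_xor[c] ^ other.chk_xor[c]
        return out

    def decode(self):
        """Peel pure cells until none remain. Mutates the table.

        Returns (ok, only_a, only_b); ok is False if peeling got stuck.
        """
        only_a, only_b = set(), set()
        progress = True
        while progress:
            progress = False
            for c in range(len(self.count)):
                pure = (self.count[c] in (1, -1)
                        and self.chk_xor[c] == _hash(self.key_xor[c], "chk"))
                if pure:
                    x, sign = self.key_xor[c], self.count[c]
                    (only_a if sign == 1 else only_b).add(x)
                    self._update(x, -sign)  # remove x from all of its cells
                    progress = True
        ok = not (any(self.count) or any(self.key_xor) or any(self.chk_xor))
        return ok, only_a, only_b


def reconcile(a, b, cells_per_hash):
    """Sketch both sets, subtract the sketches, and decode the difference."""
    ta, tb = IBLT(cells_per_hash), IBLT(cells_per_hash)
    for x in a:
        ta.insert(x)
    for x in b:
        tb.insert(x)
    return ta.subtract(tb).decode()
```

Calling `reconcile(a, b, cells_per_hash)` returns a success flag together with the elements found only in `a` and only in `b`. The table size depends on the number of differences rather than on the sizes of the sets, but decoding fails with some probability when peeling gets stuck; that failure probability is the quantity whose trade-off against space and decoding time the abstract discusses.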
- Simple set sketching. In Symposium on Simplicity in Algorithms (SOSA), pages 228–241. SIAM, 2023.
- Coding for IBLTs with listing guarantees. In IEEE International Symposium on Information Theory, ISIT 2023, Taipei, Taiwan, June 25-30, 2023, pages 1657–1662. IEEE, 2023. doi:10.1109/ISIT54713.2023.10206563.
- Mahdi Cheraghchi. Coding-theoretic methods for sparse recovery. In 49th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2011, Allerton Park & Retreat Center, Monticello, IL, USA, 28-30 September, 2011, pages 909–916. IEEE, 2011. doi:10.1109/Allerton.2011.6120263.
- Simple codes and sparse recovery with fast decoding. In 2019 IEEE International Symposium on Information Theory (ISIT), pages 156–160. IEEE, 2019.
- Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. SIAM J. Comput., 38(1):97–139, 2008. doi:10.1137/060651380.
- Straggler identification in round-trip data streams via Newton’s identities and invertible Bloom filters. IEEE Transactions on Knowledge and Data Engineering, 23(2):297–306, 2011.
- What’s the difference? Efficient set reconciliation without prior context. In Proceedings of the ACM SIGCOMM 2011 Conference, SIGCOMM ’11, pages 218–229, New York, NY, USA, 2011. Association for Computing Machinery. doi:10.1145/2018436.2018462.
- Invertible Bloom lookup tables with less memory and randomness. CoRR, abs/2306.07583, 2023. URL: https://doi.org/10.48550/arXiv.2306.07583, arXiv:2306.07583.
- Property-preserving hash functions for Hamming distance from standard assumptions. In Orr Dunkelman and Stefan Dziembowski, editors, Advances in Cryptology – EUROCRYPT 2022, pages 764–781, Cham, 2022. Springer International Publishing.
- Sumit Ganguly. Counting distinct items over update streams. Theoretical Computer Science, 378(3):211–222, 2007. Algorithms and Computation. doi:10.1016/j.tcs.2007.02.031.
- Deterministic k-set structure. Information Processing Letters, 109(1):27–31, 2008. URL: https://www.sciencedirect.com/science/article/pii/S0020019008002378, doi:10.1016/j.ipl.2008.08.010.
- Invertible Bloom lookup tables. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 792–799. IEEE, 2011.
- David Harvey and Joris van der Hoeven. Polynomial multiplication over finite fields in time O(n log n). J. ACM, 69(2):12:1–12:40, 2022. doi:10.1145/3505584.
- Data verification and reconciliation with generalized error-control codes. IEEE Transactions on Information Theory, 49(7):1788–1793, 2003.
- Jeong Han Kim. Poisson cloning model for random graphs. In Proc. ICM, Vol. III, pages 873–898, 2006. URL: https://www.mathunion.org/fileadmin/ICM/Proceedings/ICM2006.3/ICM2006.3.ocr.pdf.
- Convergence of multivariate belief propagation, with applications to cuckoo hashing and load balancing. In Proc. 24th SODA, pages 35–46, 2013. URL: http://dl.acm.org/citation.cfm?id=2627817.2627820.
- The Theory of Error-Correcting Codes. North-Holland Publishing Company, 2nd edition, 1978.
- Colin McDiarmid. On the method of bounded differences, pages 148–188. London Mathematical Society Lecture Note Series. Cambridge University Press, 1989. doi:10.1017/CBO9781107359949.008.
- Minisketch: an optimized library for BCH-based set reconciliation. https://github.com/sipa/minisketch/. Accessed: 2023-03-28.
- Set reconciliation with nearly optimal communication complexity. IEEE Transactions on Information Theory, 49(9):2213–2218, 2003. doi:10.1109/TIT.2003.815784.
- Simple multi-party set reconciliation. Distributed Computing, 31:441–453, 2018.
- Biff (Bloom filter) codes: Fast error correction for large data sets. In 2012 IEEE International Symposium on Information Theory Proceedings, pages 483–487. IEEE, 2012.
- Invertible Bloom lookup tables with listing guarantees. CoRR, abs/2212.13812, 2022. URL: https://doi.org/10.48550/arXiv.2212.13812, arXiv:2212.13812.
- Michael Molloy. Cores in random hypergraphs and Boolean formulas. Random Structures & Algorithms, 27(1):124–135, 2005.
- Thomas Morgan. An exploration of two-party reconciliation problems. PhD thesis, Harvard University, Cambridge, MA, May 2018. URL: https://dash.harvard.edu/handle/1/39947174.