Better Space-Time-Robustness Trade-Offs for Set Reconciliation

Belazzougui, Djamal; Kucherov, Gregory; Walzer, Stefan

doi:10.4230/LIPIcs.ICALP.2024.20

Author Details

Djamal Belazzougui

CAPA, DTISI, Centre de Recherche sur l'Information Scientifique et Technique, Algiers, Algeria

Gregory Kucherov

LIGM, CNRS, Université Gustave Eiffel, Marne-la-Vallée, France

Stefan Walzer

Karlsruhe Institute of Technology, Germany

Cite As

Djamal Belazzougui, Gregory Kucherov, and Stefan Walzer. Better Space-Time-Robustness Trade-Offs for Set Reconciliation. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 20:1-20:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/LIPIcs.ICALP.2024.20

Abstract

We consider the problem of reconstructing the symmetric difference between similar sets from their representations (sketches) of size linear in the number of differences. Exact solutions to this problem are based on error-correcting coding techniques and suffer from a large decoding time. Existing probabilistic solutions based on Invertible Bloom Lookup Tables (IBLTs) are time-efficient but offer insufficient success guarantees for many applications. Here we propose a tunable trade-off between the two approaches, combining the efficiency of IBLTs with an exponentially decreasing failure probability. The proof relies on a refined analysis of the IBLTs proposed in (Bæk Tejs Houen et al. SOSA 2023), which is of independent interest. We also propose a modification of our algorithm that makes it possible to tell which of the two sets each element of the symmetric difference belongs to.
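
For readers unfamiliar with the IBLT baseline mentioned in the abstract, the following is a minimal sketch of a classic IBLT with count, key-XOR and checksum fields, decoded by peeling. It is not the algorithm proposed in the paper. The class name `IBLT`, the helper `sketch`, the SHA-256-based hash `_h`, the choice of three hash functions, and the restriction to non-negative integer keys are all assumptions made for illustration.

```python
import hashlib

K = 3  # number of hash functions; each one owns its own partition of the table


def _h(x: int, salt: int) -> int:
    """Illustrative 64-bit hash of a non-negative integer key under a salt."""
    digest = hashlib.sha256(f"{salt}:{x}".encode()).digest()
    return int.from_bytes(digest[:8], "little")


class IBLT:
    def __init__(self, cells_per_part: int):
        self.m = cells_per_part
        n = K * cells_per_part
        self.count = [0] * n    # signed number of keys hashed to the cell
        self.key_xor = [0] * n  # XOR of those keys
        self.chk_xor = [0] * n  # XOR of their checksums

    def _cells(self, x: int):
        # One cell per partition, so the K cells of a key are always distinct.
        return [j * self.m + _h(x, j) % self.m for j in range(K)]

    def _update(self, x: int, sign: int):
        for c in self._cells(x):
            self.count[c] += sign
            self.key_xor[c] ^= x
            self.chk_xor[c] ^= _h(x, K)  # salt K is reserved for the checksum

    def insert(self, x: int):
        self._update(x, +1)

    def subtract(self, other: "IBLT") -> "IBLT":
        """Cell-wise difference; keys present in both sets cancel out."""
        d = IBLT(self.m)
        d.count = [a - b for a, b in zip(self.count, other.count)]
        d.key_xor = [a ^ b for a, b in zip(self.key_xor, other.key_xor)]
        d.chk_xor = [a ^ b for a, b in zip(self.chk_xor, other.chk_xor)]
        return d

    def decode(self):
        """Peel pure cells. Returns (ok, only_in_self, only_in_other).

        The sketch is consumed in the process. ok is False if peeling got
        stuck before every cell was emptied.
        """
        only_self, only_other = set(), set()
        progress = True
        while progress:
            progress = False
            for c in range(len(self.count)):
                s, x = self.count[c], self.key_xor[c]
                # A cell is treated as pure if it appears to hold one key.
                if s in (1, -1) and self.chk_xor[c] == _h(x, K) and c in self._cells(x):
                    (only_self if s == 1 else only_other).add(x)
                    self._update(x, -s)  # remove the key from all its cells
                    progress = True
        ok = not (any(self.count) or any(self.key_xor) or any(self.chk_xor))
        return ok, only_self, only_other


def sketch(items, cells_per_part: int) -> IBLT:
    t = IBLT(cells_per_part)
    for x in items:
        t.insert(x)
    return t


if __name__ == "__main__":
    alice = sketch(range(0, 1000), 20)
    bob = sketch(range(5, 1003), 20)
    # Sketch size depends on the number of differences, not on the set sizes.
    print(alice.subtract(bob).decode())
```

The sign of the count in a peeled cell indicates which side a key came from. Decoding can fail when the differing keys collide too heavily, which is the probabilistic weakness the abstract refers to.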

Subject Classification

ACM Subject Classification

Theory of computation → Design and analysis of algorithms
Theory of computation → Randomness, geometry and discrete structures
Theory of computation → Sketching and sampling
Theory of computation → Error-correcting codes

Keywords

data structures
hashing
set reconciliation
invertible Bloom lookup tables
random hypergraphs
BCH codes

References

Jakob Bæk Tejs Houen, Rasmus Pagh, and Stefan Walzer. Simple set sketching. In Symposium on Simplicity in Algorithms (SOSA), pages 228-241. SIAM, 2023.
Daniella Bar-Lev, Avi Mizrahi, Tuvi Etzion, Ori Rottenstreich, and Eitan Yaakobi. Coding for IBLTs with listing guarantees. In IEEE International Symposium on Information Theory, ISIT 2023, Taipei, Taiwan, June 25-30, 2023, pages 1657-1662. IEEE, 2023. URL: https://doi.org/10.1109/ISIT54713.2023.10206563.
Mahdi Cheraghchi. Coding-theoretic methods for sparse recovery. In 49th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2011, Allerton Park & Retreat Center, Monticello, IL, USA, 28-30 September, 2011, pages 909-916. IEEE, 2011. URL: https://doi.org/10.1109/Allerton.2011.6120263.
Mahdi Cheraghchi and João Ribeiro. Simple codes and sparse recovery with fast decoding. In 2019 IEEE International Symposium on Information Theory (ISIT), pages 156-160. IEEE, 2019.
Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin, and Adam D. Smith. Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. SIAM J. Comput., 38(1):97-139, 2008. URL: https://doi.org/10.1137/060651380.
David Eppstein and Michael T Goodrich. Straggler identification in round-trip data streams via Newton’s identities and invertible Bloom filters. IEEE Transactions on Knowledge and Data Engineering, 23(2):297-306, 2011.
David Eppstein, Michael T. Goodrich, Frank Uyeda, and George Varghese. What’s the difference? Efficient set reconciliation without prior context. In Proceedings of the ACM SIGCOMM 2011 Conference, SIGCOMM '11, pages 218-229, New York, NY, USA, 2011. Association for Computing Machinery. URL: https://doi.org/10.1145/2018436.2018462.
Nils Fleischhacker, Kasper Green Larsen, Maciej Obremski, and Mark Simkin. Invertible Bloom lookup tables with less memory and randomness. CoRR, abs/2306.07583, 2023. URL: https://doi.org/10.48550/arXiv.2306.07583.
Nils Fleischhacker, Kasper Green Larsen, and Mark Simkin. Property-preserving hash functions for Hamming distance from standard assumptions. In Orr Dunkelman and Stefan Dziembowski, editors, Advances in Cryptology - EUROCRYPT 2022, pages 764-781, Cham, 2022. Springer International Publishing.
Sumit Ganguly. Counting distinct items over update streams. Theoretical Computer Science, 378(3):211-222, 2007. Algorithms and Computation. URL: https://doi.org/10.1016/j.tcs.2007.02.031.
Sumit Ganguly and Anirban Majumder. Deterministic k-set structure. Information Processing Letters, 109(1):27-31, 2008. URL: https://doi.org/10.1016/j.ipl.2008.08.010.
Michael T Goodrich and Michael Mitzenmacher. Invertible Bloom lookup tables. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 792-799. IEEE, 2011.
David Harvey and Joris van der Hoeven. Polynomial multiplication over finite fields in time 𝒪(n log n). J. ACM, 69(2):12:1-12:40, 2022. URL: https://doi.org/10.1145/3505584.
Mark G Karpovsky, Lev B Levitin, and Ari Trachtenberg. Data verification and reconciliation with generalized error-control codes. IEEE Transactions on Information Theory, 49(7):1788-1793, 2003.
Jeong Han Kim. Poisson cloning model for random graphs. In Proc. ICM, Vol. III, pages 873-898, 2006. URL: https://www.mathunion.org/fileadmin/ICM/Proceedings/ICM2006.3/ICM2006.3.ocr.pdf.
M. Leconte, M. Lelarge, and L. Massoulié. Convergence of multivariate belief propagation, with applications to cuckoo hashing and load balancing. In Proc. 24th SODA, pages 35-46, 2013. URL: http://dl.acm.org/citation.cfm?id=2627817.2627820.
F.J. MacWilliams and N.J.A. Sloane. The Theory of Error-Correcting Codes. North-Holland Publishing Company, 2nd edition, 1978.
Colin McDiarmid. On the method of bounded differences, pages 148-188. London Mathematical Society Lecture Note Series. Cambridge University Press, 1989. URL: https://doi.org/10.1017/CBO9781107359949.008.
Minisketch: an optimized library for BCH-based set reconciliation. https://github.com/sipa/minisketch/. Accessed: 2023-03-28.
Yaron Minsky, Ari Trachtenberg, and Richard Zippel. Set reconciliation with nearly optimal communication complexity. IEEE Transactions on Information Theory, 49(9):2213-2218, 2003. URL: https://doi.org/10.1109/TIT.2003.815784.
Michael Mitzenmacher and Rasmus Pagh. Simple multi-party set reconciliation. Distributed Computing, 31:441-453, 2018.
Michael Mitzenmacher and George Varghese. Biff (Bloom filter) codes: Fast error correction for large data sets. In 2012 IEEE International Symposium on Information Theory Proceedings, pages 483-487. IEEE, 2012.
Avi Mizrahi, Daniella Bar-Lev, Eitan Yaakobi, and Ori Rottenstreich. Invertible Bloom lookup tables with listing guarantees. CoRR, abs/2212.13812, 2022. URL: https://doi.org/10.48550/arXiv.2212.13812.
Michael Molloy. Cores in random hypergraphs and boolean formulas. Random Structures & Algorithms, 27(1):124-135, 2005.
Thomas Morgan. An exploration of two-party reconciliation problems. PhD thesis, Harvard University, Cambridge, MA, May 2018. URL: https://dash.harvard.edu/handle/1/39947174.
