
A Dynamic Space-Efficient Filter with Constant Time Operations

Authors Ioana O. Bercea, Guy Even




File

LIPIcs.SWAT.2020.11.pdf
  • Filesize: 0.54 MB
  • 17 pages

Author Details

Ioana O. Bercea
  • Tel Aviv University, Israel
Guy Even
  • Tel Aviv University, Israel

Acknowledgements

We would like to thank Michael Bender, Martin Farach-Colton, and Rob Johnson for introducing us to this topic and for interesting conversations. Many thanks to Tomer Even for helpful and thoughtful remarks.

Cite As

Ioana O. Bercea and Guy Even. A Dynamic Space-Efficient Filter with Constant Time Operations. In 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 162, pp. 11:1-11:17, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.SWAT.2020.11

Abstract

A dynamic dictionary is a data structure that maintains sets of cardinality at most n from a given universe and supports insertions, deletions, and membership queries. A filter approximates membership queries with a one-sided error that occurs with probability at most ε. The goal is to obtain dynamic filters that are space-efficient (the space is 1+o(1) times the information-theoretic lower bound) and support all operations in constant time with high probability. One approach to designing filters is to reduce to the retrieval problem. When the size of the universe is polynomial in n, this approach yields a space-efficient dynamic filter as long as the error parameter ε satisfies log(1/ε) = ω(log log n). For the case that log(1/ε) = O(log log n), we present the first space-efficient dynamic filter with constant time operations in the worst case (whp). In contrast, the space-efficient dynamic filter of Pagh et al. [Anna Pagh et al., 2005] supports insertions and deletions in amortized expected constant time. Our approach employs the classic reduction of Carter et al. [Carter et al., 1978] on a new type of dictionary construction that supports random multisets.
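As a rough illustration of the fingerprint-based reduction attributed to Carter et al. in the abstract, the sketch below hashes each element to a short fingerprint and stores the fingerprints in a dictionary. It is not the construction from the paper: the class name `FingerprintFilter`, the use of BLAKE2b as a stand-in for an ideal hash function, and the `Counter` standing in for a space-efficient dictionary are all assumptions made for illustration. Because distinct elements can share a fingerprint, the stored fingerprints form a multiset, which is why the underlying dictionary has to handle multiplicities if deletions are to be supported.

```python
import hashlib
import math
from collections import Counter


class FingerprintFilter:
    """Toy dynamic filter: a multiset of hashed fingerprints (illustrative only)."""

    def __init__(self, n, eps):
        # Assumption: fingerprints are drawn from a range of size about n / eps.
        # With an ideal hash, a non-member matches any one stored fingerprint
        # with probability about eps / n, so a union bound over at most n
        # stored elements keeps the false-positive probability near eps.
        self.range_size = max(1, math.ceil(n / eps))
        # Counter is a stand-in for a dictionary that supports multisets.
        self.counts = Counter()

    def _fingerprint(self, x):
        # BLAKE2b is a stand-in for the hash function; not the paper's choice.
        digest = hashlib.blake2b(repr(x).encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.range_size

    def insert(self, x):
        self.counts[self._fingerprint(x)] += 1

    def delete(self, x):
        # The caller must only delete elements that are currently in the set.
        fp = self._fingerprint(x)
        if self.counts[fp] <= 1:
            self.counts.pop(fp, None)
        else:
            self.counts[fp] -= 1

    def query(self, x):
        # One-sided error: members are always reported, non-members sometimes.
        return self._fingerprint(x) in self.counts
```

A filter built with `FingerprintFilter(n=1000, eps=0.01)` never reports a false negative for an element that was inserted and not deleted. The toy version makes no attempt at space efficiency or worst-case constant time, which are the properties the paper is actually about.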

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures design and analysis
Keywords
  • Data Structures

References

  1. Yuriy Arbitman, Moni Naor, and Gil Segev. De-amortized cuckoo hashing: Provable worst-case performance and experimental results. In International Colloquium on Automata, Languages, and Programming, pages 107-118. Springer, 2009.
  2. Yuriy Arbitman, Moni Naor, and Gil Segev. Backyard cuckoo hashing: Constant worst-case operations with a succinct representation. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 787-796. IEEE, 2010.
  3. Michael A. Bender, Martin Farach-Colton, Mayank Goswami, Rob Johnson, Samuel McCauley, and Shikha Singh. Bloom filters, adaptivity, and the dictionary problem. CoRR, abs/1711.01616, 2017. URL: http://arxiv.org/abs/1711.01616.
  4. Michael A. Bender, Martin Farach-Colton, Mayank Goswami, Rob Johnson, Samuel McCauley, and Shikha Singh. Bloom filters, adaptivity, and the dictionary problem. In 59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 182-193, 2018. URL: https://doi.org/10.1109/FOCS.2018.00026.
  5. Michael A. Bender, Martin Farach-Colton, Rob Johnson, Russell Kraner, Bradley C. Kuszmaul, Dzejla Medjedovic, Pablo Montes, Pradeep Shetty, Richard P. Spillane, and Erez Zadok. Don't thrash: How to cache your hash on flash. PVLDB, 5(11):1627-1637, 2012. URL: https://doi.org/10.14778/2350229.2350275.
  6. Ioana O. Bercea and Guy Even. A dynamic dictionary for multisets. Work in progress, 2020.
  7. Andrei Broder and Michael Mitzenmacher. Using multiple hash functions to improve IP lookups. In Proceedings IEEE INFOCOM 2001. Conference on Computer Communications. Twentieth Annual Joint Conference of the IEEE Computer and Communications Society (Cat. No. 01CH37213), volume 3, pages 1454-1463. IEEE, 2001.
  8. Andrej Brodnik and J. Ian Munro. Membership in constant time and almost-minimum space. SIAM Journal on Computing, 28(5):1627-1640, 1999.
  9. Larry Carter, Robert Floyd, John Gill, George Markowsky, and Mark Wegman. Exact and approximate membership testers. In Proceedings of the tenth annual ACM symposium on Theory of computing, pages 59-65. ACM, 1978.
  10. J.G. Cleary. Compact hash tables using bidirectional linear probing. IEEE Transactions on Computers, (9):828-834, 1984.
  11. Ketan Dalal, Luc Devroye, Ebrahim Malalla, and Erin McLeish. Two-way chaining with reassignment. SIAM Journal on Computing, 35(2):327-340, 2005.
  12. Erik D Demaine, Friedhelm Meyer auf der Heide, Rasmus Pagh, and Mihai Pătraşcu. De dictionariis dynamicis pauco spatio utentibus. In Latin American Symposium on Theoretical Informatics, pages 349-361. Springer, 2006.
  13. Martin Dietzfelbinger and Friedhelm Meyer auf der Heide. A new universal class of hash functions and dynamic hashing in real time. In International Colloquium on Automata, Languages, and Programming, pages 6-19. Springer, 1990.
  14. Martin Dietzfelbinger and Rasmus Pagh. Succinct data structures for retrieval and approximate membership. In International Colloquium on Automata, Languages, and Programming, pages 385-396. Springer, 2008.
  15. Martin Dietzfelbinger and Michael Rink. Applications of a splitting trick. In International Colloquium on Automata, Languages, and Programming, pages 354-365. Springer, 2009.
  16. Martin Dietzfelbinger and Christoph Weidling. Balanced allocation and dictionaries with tightly packed constant size bins. Theoretical Computer Science, 380(1-2):47-68, 2007.
  17. Peter Elias. Efficient storage and retrieval by content and address of static files. Journal of the ACM (JACM), 21(2):246-260, 1974.
  18. David Eppstein. Cuckoo filter: Simplification and analysis. In 15th Scandinavian Symposium and Workshops on Algorithm Theory, SWAT 2016, June 22-24, 2016, Reykjavik, Iceland, pages 8:1-8:12, 2016. URL: https://doi.org/10.4230/LIPIcs.SWAT.2016.8.
  19. Bin Fan, David G. Andersen, Michael Kaminsky, and Michael Mitzenmacher. Cuckoo filter: Practically better than Bloom. In CoNEXT, pages 75-88. ACM, 2014.
  20. Robert Mario Fano. On the number of bits required to implement an associative memory. Memorandum 61. Computer Structures Group, Project MAC, MIT, Cambridge, Mass., 1971.
  21. Dimitris Fotakis, Rasmus Pagh, Peter Sanders, and Paul Spirakis. Space efficient hash tables with worst case constant access time. Theory of Computing Systems, 38(2):229-248, 2005.
  22. Torben Hagerup. Sorting and searching on the word RAM. In Annual Symposium on Theoretical Aspects of Computer Science, pages 366-398. Springer, 1998.
  23. Adam Kirsch and Michael Mitzenmacher. Using a queue to de-amortize cuckoo hashing in hardware. In Proceedings of the Forty-Fifth Annual Allerton Conference on Communication, Control, and Computing, volume 75, 2007.
  24. Donald E Knuth. The art of computer programming, vol. 3: Searching and sorting. Reading MA: Addison-Wesley, 1973.
  25. Shachar Lovett and Ely Porat. A lower bound for dynamic approximate membership data structures. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 797-804. IEEE, 2010.
  26. Michael Mitzenmacher. Compressed Bloom filters. IEEE/ACM Transactions on Networking (TON), 10(5):604-612, 2002.
  27. Michael Mitzenmacher, Salvatore Pontarelli, and Pedro Reviriego. Adaptive cuckoo filters. In 2018 Proceedings of the Twentieth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 36-47. SIAM, 2018.
  28. Christian Worm Mortensen, Rasmus Pagh, and Mihai Pătraşcu. On dynamic range reporting in one dimension. In Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 104-111. ACM, 2005.
  29. Anna Pagh, Rasmus Pagh, and S. Srinivasa Rao. An optimal Bloom filter replacement. In SODA, pages 823-829. SIAM, 2005.
  30. Rasmus Pagh. Low redundancy in static dictionaries with constant query time. SIAM Journal on Computing, 31(2):353-363, 2001.
  31. Rasmus Pagh, Gil Segev, and Udi Wieder. How to approximate a set without knowing its size in advance. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 80-89. IEEE, 2013.
  32. Prashant Pandey, Michael A. Bender, Rob Johnson, and Robert Patro. A general-purpose counting filter: Making every bit count. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017, pages 775-787, 2017. URL: https://doi.org/10.1145/3035918.3035963.
  33. Rina Panigrahy. Efficient hashing with lookups in two memory accesses. In Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, pages 830-839. Society for Industrial and Applied Mathematics, 2005.
  34. Mihai Pătraşcu. Succincter. In 2008 49th Annual IEEE Symposium on Foundations of Computer Science, pages 305-313. IEEE, 2008.
  35. Ely Porat. An optimal Bloom filter replacement based on matrix solving. In International Computer Science Symposium in Russia, pages 263-273. Springer, 2009.
  36. Rajeev Raman and Satti Srinivasa Rao. Succinct dynamic dictionaries and trees. In International Colloquium on Automata, Languages, and Programming, pages 357-368. Springer, 2003.
  37. James Reinders. AVX-512 Instructions, 2013. URL: https://software.intel.com/en-us/articles/intel-avx-512-instructions.
  38. Alan Siegel. On universal classes of extremely random constant-time hash functions. SIAM Journal on Computing, 33(3):505-543, 2004.
  39. Huacheng Yu. Nearly optimal static Las Vegas succinct dictionary. arXiv preprint arXiv:1911.01348, 2019.