Dynamic Elias-Fano Representation

Authors Giulio Ermanno Pibiri, Rossano Venturini




File

LIPIcs.CPM.2017.30.pdf
  • Filesize: 0.53 MB
  • 14 pages


Cite As

Giulio Ermanno Pibiri and Rossano Venturini. Dynamic Elias-Fano Representation. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 30:1-30:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017) https://doi.org/10.4230/LIPIcs.CPM.2017.30

Abstract

We show that a dynamic ordered set S of n integers drawn from a bounded universe of size u can be stored in space close to the information-theoretic lower bound while preserving the asymptotic time optimality of the operations. Our results build on the Elias-Fano representation of monotone integer sequences, which can be shown to be less than half a bit per element away from the information-theoretic minimum.
In particular, considering a RAM model with memory word size Theta(log u) bits, when integers are drawn from a polynomial universe of size u = n^gamma for any gamma = Theta(1), we add o(n) bits to the static Elias-Fano representation in order to:
1. support static predecessor/successor queries in O(min{1+log(u/n), loglog n}) time;
2. make S grow in an append-only fashion by spending O(1) time per inserted element;
3. describe a dynamic data structure supporting random access in O(log n / loglog n) worst-case time, insertions/deletions in O(log n / loglog n) amortized time, and predecessor/successor queries in O(min{1+log(u/n), loglog n}) worst-case time. These time bounds are optimal.
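
For readers unfamiliar with the static Elias-Fano representation that the results above extend, the following is a minimal illustrative sketch in Python. It is not the authors' data structure: the class name `EliasFano` and its methods are hypothetical, the bits are kept in plain Python lists rather than packed words, and the select operation is a linear scan standing in for the o(n)-bit auxiliary structures that give the time bounds stated in the abstract. It only shows how each integer is split into low bits (stored verbatim) and high bits (stored as unary-coded buckets), and how random access and successor queries read that layout.

```python
class EliasFano:
    """Static Elias-Fano encoding of a sorted integer sequence (illustrative only)."""

    def __init__(self, values, universe):
        assert values, "this sketch assumes a non-empty sequence"
        assert all(a <= b for a, b in zip(values, values[1:])), "values must be sorted"
        assert 0 <= values[0] and values[-1] < universe
        self.n = len(values)
        self.u = universe
        # Number of low bits per element: floor(log2(u / n)), never negative.
        self.l = max(0, (universe // self.n).bit_length() - 1)
        mask = (1 << self.l) - 1
        # Low parts are stored verbatim, l bits each.
        self.low = [v & mask for v in values]
        # High parts are stored in unary: bucket h holds one 1 per element
        # whose high part is h, and each bucket is terminated by a 0.
        self.high = [0] * (self.n + (universe >> self.l) + 1)
        for i, v in enumerate(values):
            self.high[(v >> self.l) + i] = 1

    def _select(self, bit, k):
        """Position of the k-th (0-based) occurrence of `bit` in the high bits.

        Linear scan for clarity; a real implementation would use a
        constant-time select structure here.
        """
        seen = -1
        for pos, b in enumerate(self.high):
            if b == bit:
                seen += 1
                if seen == k:
                    return pos
        raise IndexError(k)

    def access(self, i):
        """Return the i-th smallest element (0-based)."""
        high_part = self._select(1, i) - i
        return (high_part << self.l) | self.low[i]

    def successor(self, x):
        """Return the smallest element >= x, or None if there is none."""
        if x >= self.u:
            return None
        x = max(x, 0)
        hx = x >> self.l
        # Index of the first element whose high part is at least hx.
        i = 0 if hx == 0 else self._select(0, hx - 1) - (hx - 1)
        while i < self.n:
            v = self.access(i)
            if v >= x:
                return v
            i += 1
        return None

    def size_in_bits(self):
        """Bits used by the low and high parts (auxiliary structures excluded)."""
        return self.n * self.l + len(self.high)


ef = EliasFano([3, 4, 7, 13, 14, 15, 21, 43], universe=64)
print(ef.access(3))        # 13
print(ef.successor(16))    # 21
print(ef.size_in_bits())   # 41
```

In this example n = 8 and u = 64, so each element keeps 3 low bits and the high bits occupy 17 bits. The successor query jumps to the bucket of the query's high part and scans forward from there, which is the step that the paper's auxiliary structures are meant to speed up.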

Subject Classification

Keywords
  • succinct data structures
  • integer sets
  • predecessor problem
  • Elias-Fano

