Towards Optimal Set-Disjointness and Set-Intersection Data Structures

Authors: Tsvi Kopelowitz, Virginia Vassilevska Williams



Author Details

Tsvi Kopelowitz
  • Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
Virginia Vassilevska Williams
  • EECS Department, MIT, Cambridge, MA, USA

Cite As

Tsvi Kopelowitz and Virginia Vassilevska Williams. Towards Optimal Set-Disjointness and Set-Intersection Data Structures. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 74:1-74:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ICALP.2020.74

Abstract

In the online set-disjointness problem the goal is to preprocess a family of sets ℱ so that, given two sets S, S' ∈ ℱ, one can quickly establish whether the two sets are disjoint. If N = ∑_{S ∈ ℱ} |S|, then let N^p be the preprocessing time and let N^q be the query time. The most efficient known combinatorial algorithm is a generalization of an algorithm by Cohen and Porat [TCS'10], which has a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, based on the 3SUM hypothesis, there is a conditional lower bound curve of p+2q ≥ 2. Thus, the current state of the art exhibits a large gap.

The online set-intersection problem is the reporting version of the online set-disjointness problem: given a query, the goal is to report all of the elements in the intersection. When considering algorithms with N^p preprocessing time and N^q + O(op) query time, where op is the size of the output, the combinatorial algorithm for online set-disjointness can be extended to solve online set-intersection with a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, assuming the 3SUM hypothesis, this curve is tight for 0 ≤ q ≤ 2/3. However, for 2/3 ≤ q < 1 there is no known lower bound.

In this paper we close both gaps by showing the following:
  • For online set-disjointness we design an algorithm whose runtime, assuming ω = 2 (where ω is the exponent in the fastest matrix multiplication algorithm), matches the lower bound curve of Kopelowitz et al. for q ≤ 1/3. We then complement the new algorithm by a matching conditional lower bound for q > 1/3, which is based on a natural hypothesis on the time required to detect a triangle in an unbalanced tripartite graph. Remarkably, even if ω > 2, the algorithm matches the lower bound curve of Kopelowitz et al. for p ≥ 1.73688 and q ≤ 0.13156.
  • For set-intersection, we prove a conditional lower bound that matches the combinatorial upper bound curve for q ≥ 1/2, which is based on a hypothesis on the time required to enumerate all triangles in an unbalanced tripartite graph.
  • Finally, we design algorithms for detecting and enumerating triangles in unbalanced tripartite graphs which match the lower bounds of the corresponding hypotheses, assuming ω = 2.
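For orientation, the combinatorial p+q = 2 tradeoff mentioned above can be illustrated with the standard heavy/light idea. The sketch below is not the paper's new algorithm and is not taken from the paper. It is a minimal illustration under the assumption that a set counts as "heavy" when it has more than t elements, for a threshold t of roughly N^q. The names `DisjointnessIndex` and `threshold` are hypothetical. There are at most N/t heavy sets, so tabulating all heavy pairs costs about N^2/t = N^(2-q) time, while a query involving a light set scans at most t elements.

```python
from itertools import combinations


class DisjointnessIndex:
    """Heavy/light set-disjointness index (illustrative sketch only)."""

    def __init__(self, family, threshold):
        # `threshold` is a hypothetical tuning parameter, roughly N^q.
        self.sets = [frozenset(s) for s in family]
        heavy = [i for i, s in enumerate(self.sets) if len(s) > threshold]
        # Precompute the answer for every pair of heavy sets.
        # `heavy` is in increasing order, so every key (i, j) has i < j.
        self.heavy_disjoint = {
            (i, j): self.sets[i].isdisjoint(self.sets[j])
            for i, j in combinations(heavy, 2)
        }

    def disjoint(self, i, j):
        """Return True if sets number i and j share no element."""
        if i == j:
            # A set is disjoint from itself only if it is empty.
            return not self.sets[i]
        key = (min(i, j), max(i, j))
        if key in self.heavy_disjoint:
            # Both sets are heavy: constant-time table lookup.
            return self.heavy_disjoint[key]
        # At least one set is light: scan the smaller set and probe the
        # other one, which takes at most `threshold` membership tests.
        a, b = self.sets[i], self.sets[j]
        if len(a) > len(b):
            a, b = b, a
        return all(x not in b for x in a)


family = [{1, 2, 3, 4}, {5, 6, 7, 8}, {4, 9}, {10}]
index = DisjointnessIndex(family, threshold=2)
print(index.disjoint(0, 1))  # both heavy, answered from the table
print(index.disjoint(0, 2))  # one light set, answered by scanning it
```

Raising the threshold shifts work from preprocessing to queries, which is the tradeoff the curve p+q = 2 describes.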

Subject Classification

ACM Subject Classification
  • Theory of computation → Data structures design and analysis
Keywords
  • Set-disjointness data structures
  • Triangle detection
  • Triangle enumeration
  • Fine-grained complexity
  • Fast matrix multiplication

References

  1. A. Abboud and V. Vassilevska Williams. Popular conjectures imply strong lower bounds for dynamic problems. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS, pages 434-443, 2014.
  2. P. Afshani and J. Sindahl Nielsen. Data structure lower bounds for document indexing problems. In 43rd International Colloquium on Automata, Languages, and Programming, ICALP, pages 93:1-93:15, 2016.
  3. N. Alon, R. Yuster, and U. Zwick. Finding and counting given length cycles. Algorithmica, 17:209-223, 1997.
  4. A. Amir, T. M. Chan, M. Lewenstein, and N. Lewenstein. On hardness of jumbled indexing. In Automata, Languages, and Programming - 41st International Colloquium, ICALP, Part I, pages 114-125, 2014.
  5. A. Amir, T. Kopelowitz, A. Levy, S. Pettie, E. Porat, and B. R. Shalom. Mind the gap: Essentially optimal algorithms for online dictionary matching with one gap. In 27th International Symposium on Algorithms and Computation, ISAAC, pages 12:1-12:12, 2016.
  6. R. A. Baeza-Yates. A fast set intersection algorithm for sorted sequences. In Combinatorial Pattern Matching, 15th Annual Symposium, CPM, pages 400-408, 2004. URL: https://doi.org/10.1007/978-3-540-27801-6_30.
  7. J. Barbay and C. Kenyon. Adaptive intersection and t-threshold problems. In Proceedings 13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 390-399, 2002.
  8. Philip Bille, Anna Pagh, and Rasmus Pagh. Fast evaluation of union-intersection expressions. In Algorithms and Computation, 18th International Symposium, ISAAC, pages 739-750, 2007.
  9. A. Bjorklund, R. Pagh, V. Vassilevska Williams, and U. Zwick. Listing triangles. In Automata, Languages, and Programming - 41st International Colloquium, ICALP, Part I, pages 223-234, 2014.
  10. T. M. Chan, S. Durocher, K. Green Larsen, J. Morrison, and B. T. Wilkinson. Linear-space data structures for range mode query in arrays. Theory Comput. Syst., 55(4):719-741, 2014.
  11. K. Chatterjee, W. Dvorák, M. Henzinger, and A. Svozil. Algorithms and conditional lower bounds for planning problems. In Proceedings of the Twenty-Eighth International Conference on Automated Planning and Scheduling, ICAPS, pages 56-64. AAAI Press, 2018.
  12. N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM J. Comput., 14(1):210-223, 1985.
  13. H. Cohen and E. Porat. Fast set intersection and two-patterns matching. Theor. Comput. Sci., 411(40-42):3795-3800, 2010. URL: https://doi.org/10.1016/j.tcs.2010.06.002.
  14. H. Cohen and E. Porat. On the hardness of distance oracle for sparse graph. CoRR, abs/1006.1117, 2010. URL: http://arxiv.org/abs/1006.1117.
  15. D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. J. Symbolic Computation, 9(3):251-280, 1990.
  16. P. Davoodi, M. H. M. Smid, and F. van Walderveen. Two-dimensional range diameter queries. In LATIN 2012: Theoretical Informatics - 10th Latin American Symposium, pages 219-230, 2012.
  17. E. D. Demaine, A. López-Ortiz, and J. Ian Munro. Adaptive set intersections, unions, and differences. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 743-752, 2000. URL: http://dl.acm.org/citation.cfm?id=338219.338634.
  18. Lech Duraj, Krzysztof Kleiner, Adam Polak, and Virginia Vassilevska Williams. Equivalences between triangle and range query problems. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 30-47. SIAM, 2020.
  19. D. Eppstein, M. T. Goodrich, M. Mitzenmacher, and M. R. Torres. 2-3 Cuckoo filters for faster triangle listing and set intersection. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS), pages 247-260, 2017.
  20. J. Fischer, T. Gagie, T. Kopelowitz, M. Lewenstein, V. Mäkinen, L. Salmela, and N. Välimäki. Forbidden patterns. In LATIN 2012: Theoretical Informatics - 10th Latin American Symposium, pages 327-337, 2012.
  21. F. Le Gall and F. Urrutia. Improved rectangular matrix multiplication using powers of the Coppersmith-Winograd tensor. In Artur Czumaj, editor, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 1029-1046, 2018.
  22. I. Goldstein, T. Kopelowitz, M. Lewenstein, and E. Porat. How hard is it to find (honest) witnesses? In 24th Annual European Symposium on Algorithms, ESA, pages 45:1-45:16, 2016.
  23. I. Goldstein, T. Kopelowitz, M. Lewenstein, and E. Porat. Conditional lower bounds for space/time tradeoffs. In Algorithms and Data Structures - 15th International Symposium, WADS, pages 421-436, 2017.
  24. Wing-Kai Hon, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. Space-efficient frameworks for top-k string retrieval. J. ACM, 61(2):9:1-9:36, 2014.
  25. A. Itai and M. Rodeh. Finding a minimum circuit in a graph. SIAM J. Comput., 7(4):413-423, 1978.
  26. Z. Jafargholi and E. Viola. 3SUM, 3XOR, triangles. Algorithmica, 74(1):326-343, 2016.
  27. T. Kopelowitz and R. Krauthgamer. Color-distance oracles and snippets. In 27th Annual Symposium on Combinatorial Pattern Matching, CPM, pages 24:1-24:10, 2016.
  28. T. Kopelowitz, S. Pettie, and E. Porat. Dynamic set intersection. In Proceedings 14th Int'l Symposium on Algorithms and Data Structures (WADS), pages 470-481, 2015.
  29. T. Kopelowitz, S. Pettie, and E. Porat. Higher lower bounds from the 3SUM conjecture. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 1272-1287, 2016.
  30. F. Le Gall. Faster algorithms for rectangular matrix multiplication. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 514-523, 2012.
  31. F. Le Gall. Powers of tensors and fast matrix multiplication. In International Symposium on Symbolic and Algebraic Computation, ISSAC '14, Kobe, Japan, July 23-25, 2014, pages 296-303, 2014.
  32. M. Patrascu. Towards polynomial lower bounds for dynamic problems. In Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC, pages 603-610, 2010.
  33. M. Patrascu and L. Roditty. Distance oracles beyond the Thorup-Zwick bound. SIAM J. Comput., 43(1):300-311, 2014.
  34. M. Patrascu, L. Roditty, and M. Thorup. A new infinity of distance oracles for sparse graphs. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 738-747, 2012.
  35. A. Stothers. On the complexity of matrix multiplication. Ph.D. Thesis, U. Edinburgh, 2010.
  36. V. Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. In Proceedings of the 44th Symposium on Theory of Computing Conference, STOC, pages 887-898, 2012.
  37. V. Vassilevska Williams. On some fine-grained questions in algorithms and complexity. In Proceedings of the International Congress of Mathematicians, pages 3431-3475, 2018.
  38. R. Yuster and U. Zwick. Fast sparse matrix multiplication. ACM Trans. on Algorithms, 1(1):2-13, 2005.