# Towards Optimal Set-Disjointness and Set-Intersection Data Structures

## File

LIPIcs.ICALP.2020.74.pdf
• Filesize: 0.61 MB
• 16 pages

## Cite As

Tsvi Kopelowitz and Virginia Vassilevska Williams. Towards Optimal Set-Disjointness and Set-Intersection Data Structures. In 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 168, pp. 74:1-74:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.ICALP.2020.74

## Abstract

In the online set-disjointness problem the goal is to preprocess a family of sets ℱ, so that given two sets S, S' ∈ ℱ, one can quickly establish whether the two sets are disjoint or not. If N = ∑_{S ∈ ℱ} |S|, then let N^p be the preprocessing time and let N^q be the query time. The most efficient known combinatorial algorithm is a generalization of an algorithm by Cohen and Porat [TCS'10] which has a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, based on the 3SUM hypothesis, there is a conditional lower bound curve of p+2q ≥ 2. Thus, the current state-of-the-art exhibits a large gap.

The online set-intersection problem is the reporting version of the online set-disjointness problem: given a query, the goal is to report all of the elements in the intersection. When considering algorithms with N^p preprocessing time and N^q + O(op) query time, where op is the size of the output, the combinatorial algorithm for online set-disjointness can be extended to solve online set-intersection with a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, assuming the 3SUM hypothesis, for 0 ≤ q ≤ 2/3 this curve is tight. However, for 2/3 ≤ q < 1 there is no known lower bound.

In this paper we close both gaps by showing the following:
- For online set-disjointness we design an algorithm whose runtime, assuming ω = 2 (where ω is the exponent in the fastest matrix multiplication algorithm), matches the lower bound curve of Kopelowitz et al. for q ≤ 1/3. We then complement the new algorithm with a matching conditional lower bound for q > 1/3 which is based on a natural hypothesis on the time required to detect a triangle in an unbalanced tripartite graph. Remarkably, even if ω > 2, the algorithm matches the lower bound curve of Kopelowitz et al. for p ≥ 1.73688 and q ≤ 0.13156.
- For set-intersection, we prove a conditional lower bound that matches the combinatorial upper bound curve for q ≥ 1/2 which is based on a hypothesis on the time required to enumerate all triangles in an unbalanced tripartite graph.
- Finally, we design algorithms for detecting and enumerating triangles in unbalanced tripartite graphs which match the lower bounds of the corresponding hypotheses, assuming ω = 2.
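
To make the combinatorial baseline with tradeoff p+q = 2 concrete, the following is a minimal sketch of the standard heavy/light idea usually associated with that curve. It is not the new algorithm of this paper, and it is only an assumed illustration of the Cohen and Porat style approach. The class name `SetDisjointness`, its parameter `q`, and the method `disjoint` are hypothetical names chosen for this example. A set is called heavy if it has more than N^q elements, so there are at most N^(1-q) heavy sets; answers for all pairs of heavy sets are tabulated in roughly N^(2-q) time, and any query involving a light set scans that set in at most about N^q steps.

```python
import math


class SetDisjointness:
    """Heavy/light sketch (hypothetical names, illustration only)."""

    def __init__(self, family, q=0.5):
        self.sets = [frozenset(s) for s in family]
        n_total = sum(len(s) for s in self.sets)  # N = sum of set sizes
        # Assumed threshold: sets larger than about N^q are "heavy".
        self.threshold = max(1, math.ceil(n_total ** q))
        self.heavy = [i for i, s in enumerate(self.sets)
                      if len(s) > self.threshold]
        # Tabulate every intersecting pair of heavy sets.
        # There are at most N / threshold heavy sets, and each one is
        # compared against sets of total size at most N.
        self.intersecting = set()
        for a, i in enumerate(self.heavy):
            for j in self.heavy[a + 1:]:
                if not self.sets[i].isdisjoint(self.sets[j]):
                    self.intersecting.add((i, j))

    def disjoint(self, i, j):
        """Return True if sets i and j share no element."""
        s, t = self.sets[i], self.sets[j]
        if i == j:
            return len(s) == 0
        if len(s) > self.threshold and len(t) > self.threshold:
            # Both heavy: constant-time table lookup.
            return (min(i, j), max(i, j)) not in self.intersecting
        # At least one set is light: scan the smaller one.
        small, large = (s, t) if len(s) <= len(t) else (t, s)
        return not any(x in large for x in small)


family = [
    {1, 2, 3, 4, 5, 6},
    {6, 7, 8, 9, 10, 11},
    {12, 13, 14, 15, 16, 17},
    {1, 20},
    {30},
]
ds = SetDisjointness(family, q=0.5)
print(ds.disjoint(0, 1), ds.disjoint(0, 2), ds.disjoint(0, 3))
```

Raising `q` shrinks the table of heavy pairs and shifts work to the queries, while lowering it does the opposite, which is the p+q = 2 tradeoff in miniature. Reporting the intersection instead of a yes/no answer would need a different treatment of heavy pairs and is not covered by this sketch.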

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Data structures design and analysis
##### Keywords
• Set-disjointness data structures
• Triangle detection
• Triangle enumeration
• Fine-grained complexity
• Fast matrix multiplication

## References

1. A. Abboud and V. Vassilevska Williams. Popular conjectures imply strong lower bounds for dynamic problems. In 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS, pages 434-443, 2014.
2. P. Afshani and J. Sindahl Nielsen. Data structure lower bounds for document indexing problems. In 43rd International Colloquium on Automata, Languages, and Programming, ICALP, pages 93:1-93:15, 2016.
3. N. Alon, R. Yuster, and U. Zwick. Finding and counting given length cycles. Algorithmica, 17:209-223, 1997.
4. A. Amir, T. M. Chan, M. Lewenstein, and N. Lewenstein. On hardness of jumbled indexing. In Automata, Languages, and Programming - 41st International Colloquium, ICALP, Part I, pages 114-125, 2014.
5. A. Amir, T. Kopelowitz, A. Levy, S. Pettie, E. Porat, and B. R. Shalom. Mind the gap: Essentially optimal algorithms for online dictionary matching with one gap. In 27th International Symposium on Algorithms and Computation, ISAAC, pages 12:1-12:12, 2016.
6. R. A. Baeza-Yates. A fast set intersection algorithm for sorted sequences. In Combinatorial Pattern Matching, 15th Annual Symposium, CPM, pages 400-408, 2004. URL: https://doi.org/10.1007/978-3-540-27801-6_30.
7. J. Barbay and C. Kenyon. Adaptive intersection and t-threshold problems. In Proceedings 13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 390-399, 2002.
8. Philip Bille, Anna Pagh, and Rasmus Pagh. Fast evaluation of union-intersection expressions. In Algorithms and Computation, 18th International Symposium, ISAAC, pages 739-750, 2007.
9. A. Björklund, R. Pagh, V. Vassilevska Williams, and U. Zwick. Listing triangles. In Automata, Languages, and Programming - 41st International Colloquium, ICALP, Part I, pages 223-234, 2014.
10. T. M. Chan, S. Durocher, K. Green Larsen, J. Morrison, and B. T. Wilkinson. Linear-space data structures for range mode query in arrays. Theory Comput. Syst., 55(4):719-741, 2014.
11. K. Chatterjee, W. Dvořák, M. Henzinger, and A. Svozil. Algorithms and conditional lower bounds for planning problems. In Proceedings of the Twenty-Eighth International Conference on Automated Planning and Scheduling, ICAPS, pages 56-64. AAAI Press, 2018.
12. N. Chiba and T. Nishizeki. Arboricity and subgraph listing algorithms. SIAM J. Comput., 14(1):210-223, 1985.
13. H. Cohen and E. Porat. Fast set intersection and two-patterns matching. Theor. Comput. Sci., 411(40-42):3795-3800, 2010. URL: https://doi.org/10.1016/j.tcs.2010.06.002.
14. H. Cohen and E. Porat. On the hardness of distance oracle for sparse graph. CoRR, abs/1006.1117, 2010. URL: http://arxiv.org/abs/1006.1117.
15. D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. J. Symbolic Computation, 9(3):251-280, 1990.
16. P. Davoodi, M. H. M. Smid, and F. van Walderveen. Two-dimensional range diameter queries. In LATIN 2012: Theoretical Informatics - 10th Latin American Symposium, pages 219-230, 2012.
17. E. D. Demaine, A. López-Ortiz, and J. Ian Munro. Adaptive set intersections, unions, and differences. In Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 743-752, 2000. URL: http://dl.acm.org/citation.cfm?id=338219.338634.
18. Lech Duraj, Krzysztof Kleiner, Adam Polak, and Virginia Vassilevska Williams. Equivalences between triangle and range query problems. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 30-47. SIAM, 2020.
19. D. Eppstein, M. T. Goodrich, M. Mitzenmacher, and M. R. Torres. 2-3 Cuckoo filters for faster triangle listing and set intersection. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS), pages 247-260, 2017.
20. J. Fischer, T. Gagie, T. Kopelowitz, M. Lewenstein, V. Mäkinen, L. Salmela, and N. Välimäki. Forbidden patterns. In LATIN 2012: Theoretical Informatics - 10th Latin American Symposium, pages 327-337, 2012.
21. F. Le Gall and F. Urrutia. Improved rectangular matrix multiplication using powers of the Coppersmith-Winograd tensor. In Artur Czumaj, editor, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 1029-1046, 2018.
22. I. Goldstein, T. Kopelowitz, M. Lewenstein, and E. Porat. How hard is it to find (honest) witnesses? In 24th Annual European Symposium on Algorithms, ESA, pages 45:1-45:16, 2016.
23. I. Goldstein, T. Kopelowitz, M. Lewenstein, and E. Porat. Conditional lower bounds for space/time tradeoffs. In Algorithms and Data Structures - 15th International Symposium, WADS, pages 421-436, 2017.
24. Wing-Kai Hon, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. Space-efficient frameworks for top-k string retrieval. J. ACM, 61(2):9:1-9:36, 2014.
25. A. Itai and M. Rodeh. Finding a minimum circuit in a graph. SIAM J. Comput., 7(4):413-423, 1978.
26. Z. Jafargholi and E. Viola. 3SUM, 3XOR, triangles. Algorithmica, 74(1):326-343, 2016.
27. T. Kopelowitz and R. Krauthgamer. Color-distance oracles and snippets. In 27th Annual Symposium on Combinatorial Pattern Matching, CPM, pages 24:1-24:10, 2016.
28. T. Kopelowitz, S. Pettie, and E. Porat. Dynamic set intersection. In Proceedings 14th Int'l Symposium on Algorithms and Data Structures (WADS), pages 470-481, 2015.
29. T. Kopelowitz, S. Pettie, and E. Porat. Higher lower bounds from the 3SUM conjecture. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pages 1272-1287, 2016.
30. F. Le Gall. Faster algorithms for rectangular matrix multiplication. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 514-523, 2012.
31. F. Le Gall. Powers of tensors and fast matrix multiplication. In International Symposium on Symbolic and Algebraic Computation, ISSAC '14, Kobe, Japan, July 23-25, 2014, pages 296-303, 2014.
32. M. Patrascu. Towards polynomial lower bounds for dynamic problems. In Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC, pages 603-610, 2010.
33. M. Patrascu and L. Roditty. Distance oracles beyond the Thorup-Zwick bound. SIAM J. Comput., 43(1):300-311, 2014.
34. M. Patrascu, L. Roditty, and M. Thorup. A new infinity of distance oracles for sparse graphs. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS, pages 738-747, 2012.
35. A. Stothers. On the complexity of matrix multiplication. Ph.D. Thesis, U. Edinburgh, 2010.
36. V. Vassilevska Williams. Multiplying matrices faster than Coppersmith-Winograd. In Proceedings of the 44th Symposium on Theory of Computing Conference, STOC, pages 887-898, 2012.
37. V. Vassilevska Williams. On some fine-grained questions in algorithms and complexity. In Proceedings of the International Congress of Mathematicians, pages 3431-3475, 2018.
38. R. Yuster and U. Zwick. Fast sparse matrix multiplication. ACM Trans. on Algorithms, 1(1):2-13, 2005.