when quoting this document, please refer to the following
DOI:
URN: urn:nbn:de:0030-drops-124813
URL:

;

Towards Optimal Set-Disjointness and Set-Intersection Data Structures

 pdf-format:

Abstract

In the online set-disjointness problem the goal is to preprocess a family of sets ℱ, so that given two sets S,S' ∈ ℱ, one can quickly establish whether the two sets are disjoint or not. If N = ∑_{S ∈ ℱ} |S|, then let N^p be the preprocessing time and let N^q be the query time. The most efficient known combinatorial algorithm is a generalization of an algorithm by Cohen and Porat [TCS'10] which has a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, based on the 3SUM hypothesis, there is a conditional lower bound curve of p+2q ≥ 2. Thus, the current state-of-the-art exhibits a large gap.
The online set-intersection problem is the reporting version of the online set-disjointness problem, and given a query, the goal is to report all of the elements in the intersection. When considering algorithms with N^p preprocessing time and N^q +O(op) query time, where op is the size of the output, the combinatorial algorithm for online set-disjointess can be extended to solve online set-intersection with a tradeoff curve of p+q = 2. Kopelowitz, Pettie, and Porat [SODA'16] showed that, assuming the 3SUM hypothesis, for 0 ≤ q ≤ 2/3 this curve is tight. However, for 2/3 ≤ q < 1 there is no known lower bound.
In this paper we close both gaps by showing the following:
- For online set-disjointness we design an algorithm whose runtime, assuming ω = 2 (where ω is the exponent in the fastest matrix multiplication algorithm), matches the lower bound curve of Kopelowitz et al., for q ≤ 1/3. We then complement the new algorithm by a matching conditional lower bound for q > 1/3 which is based on a natural hypothesis on the time required to detect a triangle in an unbalanced tripartite graph. Remarkably, even if ω > 2, the algorithm matches the lower bound curve of Kopelowitz et al. for p≥ 1.73688 and q ≤ 0.13156.
- For set-intersection, we prove a conditional lower bound that matches the combinatorial upper bound curve for q≥ 1/2 which is based on a hypothesis on the time required to enumerate all triangles in an unbalanced tripartite graph.
- Finally, we design algorithms for detecting and enumerating triangles in unbalanced tripartite graphs which match the lower bounds of the corresponding hypotheses, assuming ω = 2.

BibTeX - Entry

@InProceedings{kopelowitz_et_al:LIPIcs:2020:12481,
author =	{Tsvi Kopelowitz and Virginia Vassilevska Williams},
title =	{{Towards Optimal Set-Disjointness and Set-Intersection Data Structures}},
booktitle =	{47th International Colloquium on Automata, Languages, and Programming (ICALP 2020)},
pages =	{74:1--74:16},
series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN =	{978-3-95977-138-2},
ISSN =	{1868-8969},
year =	{2020},
volume =	{168},
editor =	{Artur Czumaj and Anuj Dawar and Emanuela Merelli},
publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},