eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2023-08-30
102:1
102:15
10.4230/LIPIcs.ESA.2023.102
article
Maximum Coverage in Random-Arrival Streams
Warneke, Rowan
1
https://orcid.org/0009-0004-6465-836X
Choudhury, Farhana
1
https://orcid.org/0000-0001-6529-4220
Wirth, Anthony
1
https://orcid.org/0000-0003-3746-6704
School of Computing and Information Systems, The University of Melbourne, Australia
Given a collection of m sets, each a subset of a universe {1, …, n}, maximum coverage is the problem of choosing k sets whose union has the largest cardinality. A simple greedy algorithm achieves an approximation factor of 1 - 1 / e ≈ 0.632, which is the best possible polynomial-time approximation unless P = NP.
In the streaming setting, information about the input is revealed gradually, in an online fashion. In the set-streaming model, each set is listed contiguously in the stream. In the more general edge-streaming model, the stream is composed of set-element pairs, denoting membership. The overall goal in the streaming setting is to design algorithms that use sublinear space in the size of the input. An interesting line of research is to design algorithms with space complexity polylogarithmic in the size of the input (i.e., polylogarithmic in both n and m); we call such algorithms low-space. In the set-streaming model, it is known that 1/2 is the best possible low-space approximation. In the edge-streaming model, no low-space algorithm can achieve a nontrivial approximation factor.
We study the problem under the assumption that the order in which the stream arrives is chosen uniformly at random. Our main results are as follows.
- In the random-arrival set-streaming model, we give two new algorithms to show that low space is sufficient to break the 1/2 barrier. The first achieves an approximation factor of 1/2 + c₁ using Õ(k²) space, where c₁ > 0 is a small constant and Õ(⋅) notation suppresses polylogarithmic factors; the second achieves a factor of 1 - 1 / e - ε - o(1) using Õ(k² ε^{-3}) space, where the o(1) term is a function of k. This is essentially the optimal bound, as breaking the 1-1/e barrier is known to require high space.
- In the random-arrival edge-streaming model, we show for all fixed α > 0 and δ > 0, any algorithm that α-approximates maximum coverage with probability at least 0.9 in the random-arrival edge-streaming model requires Ω(m^{1-δ}) space (i.e., high space), even for the special case of k = 1.
https://drops.dagstuhl.de/storage/00lipics/lipics-vol274-esa2023/LIPIcs.ESA.2023.102/LIPIcs.ESA.2023.102.pdf
Maximum Coverage
Streaming Algorithm
Random Arrival
Greedy Algorithm
Communication Complexity