Abstract
Given a collection of m sets, each a subset of a universe {1, …, n}, maximum coverage is the problem of choosing k sets whose union has the largest cardinality. A simple greedy algorithm achieves an approximation factor of 1  1 / e ≈ 0.632, which is the best possible polynomialtime approximation unless P = NP.
In the streaming setting, information about the input is revealed gradually, in an online fashion. In the setstreaming model, each set is listed contiguously in the stream. In the more general edgestreaming model, the stream is composed of setelement pairs, denoting membership. The overall goal in the streaming setting is to design algorithms that use sublinear space in the size of the input. An interesting line of research is to design algorithms with space complexity polylogarithmic in the size of the input (i.e., polylogarithmic in both n and m); we call such algorithms lowspace. In the setstreaming model, it is known that 1/2 is the best possible lowspace approximation. In the edgestreaming model, no lowspace algorithm can achieve a nontrivial approximation factor.
We study the problem under the assumption that the order in which the stream arrives is chosen uniformly at random. Our main results are as follows.
 In the randomarrival setstreaming model, we give two new algorithms to show that low space is sufficient to break the 1/2 barrier. The first achieves an approximation factor of 1/2 + c₁ using Õ(k²) space, where c₁ > 0 is a small constant and Õ(⋅) notation suppresses polylogarithmic factors; the second achieves a factor of 1  1 / e  ε  o(1) using Õ(k² ε^{3}) space, where the o(1) term is a function of k. This is essentially the optimal bound, as breaking the 11/e barrier is known to require high space.
 In the randomarrival edgestreaming model, we show for all fixed α > 0 and δ > 0, any algorithm that αapproximates maximum coverage with probability at least 0.9 in the randomarrival edgestreaming model requires Ω(m^{1δ}) space (i.e., high space), even for the special case of k = 1.
BibTeX  Entry
@InProceedings{warneke_et_al:LIPIcs.ESA.2023.102,
author = {Warneke, Rowan and Choudhury, Farhana and Wirth, Anthony},
title = {{Maximum Coverage in RandomArrival Streams}},
booktitle = {31st Annual European Symposium on Algorithms (ESA 2023)},
pages = {102:1102:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959772952},
ISSN = {18688969},
year = {2023},
volume = {274},
editor = {G{\o}rtz, Inge Li and FarachColton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/18755},
URN = {urn:nbn:de:0030drops187559},
doi = {10.4230/LIPIcs.ESA.2023.102},
annote = {Keywords: Maximum Coverage, Streaming Algorithm, Random Arrival, Greedy Algorithm, Communication Complexity}
}
Keywords: 

Maximum Coverage, Streaming Algorithm, Random Arrival, Greedy Algorithm, Communication Complexity 
Collection: 

31st Annual European Symposium on Algorithms (ESA 2023) 
Issue Date: 

2023 
Date of publication: 

30.08.2023 