Improved Algorithms for Maximum Coverage in Dynamic and Random Order Streams

Authors Amit Chakrabarti, Andrew McGregor, Anthony Wirth

Amit Chakrabarti
  • Department of Computer Science, Dartmouth College, Hanover, NH, USA
Andrew McGregor
  • Manning College of Information and Computer Sciences, University of Massachusetts, Amherst, MA, USA
Anthony Wirth
  • School of Computing and Information Systems, The University of Melbourne, Australia
  • School of Computer Science, The University of Sydney, Australia


Amit Chakrabarti, Andrew McGregor, and Anthony Wirth. Improved Algorithms for Maximum Coverage in Dynamic and Random Order Streams. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 40:1-40:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


The maximum coverage problem is to select k sets, from a collection of m sets, such that the cardinality of their union, in a universe of size n, is maximized. We consider (1-1/e-ε)-approximation algorithms for this NP-hard problem in three standard data stream models. 1) Dynamic Model. The stream consists of a sequence of sets being inserted and deleted. Our multi-pass algorithm uses ε^{-2} k ⋅ polylog(n,m) space. The best previous result (Assadi and Khanna, SODA 2018) used (n + ε^{-4} k) ⋅ polylog(n,m) space. While both algorithms use O(ε^{-1} log m) passes, our analysis shows that, when ε ≤ 1/log log m, it is possible to reduce the number of passes by a 1/log log m factor without incurring additional space. 2) Random Order Model. In this model, there are no deletions, and the sets forming the instance are uniformly randomly permuted to form the input stream. We show that a single pass and k ⋅ polylog(n,m) space suffice for an arbitrarily small constant ε. The best previous result, by Warneke et al. (ESA 2023), used k² ⋅ polylog(n,m) space. 3) Insert-Only Model. Lastly, our results, along with numerous previous results, use a sub-sampling technique introduced by McGregor and Vu (ICDT 2017) to sparsify the input instance. We explain how this technique and others used in the paper can be implemented such that the amortized update time of our algorithm is polylogarithmic. This also implies an improvement over the state-of-the-art insert-only algorithms in terms of update time: polylog(m,n) update time suffices, whereas the best previous result, by Jaud et al. (SEA 2023), required update time that was linear in k.
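
To make the problem statement concrete, the following is a minimal sketch of the classical offline greedy heuristic for maximum coverage, which is known to achieve a (1-1/e)-approximation when the whole instance fits in memory. It is not the streaming algorithm of the paper; the function name `greedy_max_coverage` and the toy instance are assumptions introduced purely for illustration.

```python
from typing import Dict, List, Set, Tuple


def greedy_max_coverage(
    sets: Dict[str, Set[int]], k: int
) -> Tuple[List[str], Set[int]]:
    """Offline greedy for maximum coverage (illustrative, not from the paper).

    Repeatedly picks the set that covers the most not-yet-covered
    elements, for at most k rounds.
    """
    chosen: List[str] = []
    covered: Set[int] = set()
    for _ in range(min(k, len(sets))):
        # Candidate sets are those not selected in an earlier round.
        remaining = [name for name in sets if name not in chosen]
        # Pick the set with the largest marginal gain.
        best = max(remaining, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds a new element
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Hypothetical toy instance: m = 4 sets over a universe of n = 7 elements.
instance = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {4, 5, 6, 7},
    "D": {1, 7},
}
picked, union = greedy_max_coverage(instance, k=2)
print(picked, len(union))
```

On this toy instance the greedy picks C first (four new elements) and then A (three new elements), covering all seven elements with k = 2 sets. The streaming settings described above are harder precisely because the sets cannot all be stored and rescanned freely in this way.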

ACM Subject Classification
  • Theory of computation → Sketching and sampling
Keywords
  • Data Stream Computation
  • Maximum Coverage
  • Submodular Maximization


