Maximum Coverage in Random-Arrival Streams

Authors: Rowan Warneke, Farhana Choudhury, Anthony Wirth




File

LIPIcs.ESA.2023.102.pdf
  • Filesize: 0.8 MB
  • 15 pages

Author Details

Rowan Warneke
  • School of Computing and Information Systems, The University of Melbourne, Australia
Farhana Choudhury
  • School of Computing and Information Systems, The University of Melbourne, Australia
Anthony Wirth
  • School of Computing and Information Systems, The University of Melbourne, Australia

Acknowledgements

Thanks to Andrew McGregor for suggesting this as a possible research direction.

Cite As

Rowan Warneke, Farhana Choudhury, and Anthony Wirth. Maximum Coverage in Random-Arrival Streams. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 102:1-102:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023) https://doi.org/10.4230/LIPIcs.ESA.2023.102

Abstract

Given a collection of m sets, each a subset of a universe {1, …, n}, maximum coverage is the problem of choosing k sets whose union has the largest cardinality. A simple greedy algorithm achieves an approximation factor of 1 - 1 / e ≈ 0.632, which is the best possible polynomial-time approximation unless P = NP.
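The greedy algorithm mentioned above can be sketched as follows. This is a minimal offline illustration, not code from the paper; the function name `greedy_max_coverage`, the dictionary representation of the sets, and the example instance are assumptions made for the sketch.

```python
from typing import Dict, List, Set, Tuple


def greedy_max_coverage(sets: Dict[str, Set[int]], k: int) -> Tuple[List[str], Set[int]]:
    """Pick up to k sets, each time taking the set with the largest marginal gain.

    `sets` maps a (hypothetical) set name to its elements of the universe.
    Returns the chosen names and the union of the chosen sets.
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(min(k, len(sets))):
        # Candidates are the sets not yet chosen.
        candidates = [name for name in sets if name not in chosen]
        # Marginal gain = number of elements not covered so far.
        best = max(candidates, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds a new element
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


if __name__ == "__main__":
    example = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {1}}
    names, union = greedy_max_coverage(example, k=2)
    print(names, len(union))
```

On the example instance, the first pick is the largest set and the second pick is the set contributing the most uncovered elements, which need not be the second-largest set.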
In the streaming setting, information about the input is revealed gradually, in an online fashion. In the set-streaming model, each set is listed contiguously in the stream. In the more general edge-streaming model, the stream is composed of set-element pairs, denoting membership. The overall goal in the streaming setting is to design algorithms that use sublinear space in the size of the input. An interesting line of research is to design algorithms with space complexity polylogarithmic in the size of the input (i.e., polylogarithmic in both n and m); we call such algorithms low-space. In the set-streaming model, it is known that 1/2 is the best possible low-space approximation. In the edge-streaming model, no low-space algorithm can achieve a nontrivial approximation factor.
We study the problem under the assumption that the order in which the stream arrives is chosen uniformly at random. Our main results are as follows.  
- In the random-arrival set-streaming model, we give two new algorithms to show that low space is sufficient to break the 1/2 barrier. The first achieves an approximation factor of 1/2 + c₁ using Õ(k²) space, where c₁ > 0 is a small constant and Õ(⋅) notation suppresses polylogarithmic factors; the second achieves a factor of 1 - 1 / e - ε - o(1) using Õ(k² ε^{-3}) space, where the o(1) term is a function of k. This is essentially the optimal bound, as breaking the 1-1/e barrier is known to require high space. 
- In the random-arrival edge-streaming model, we show that, for all fixed α > 0 and δ > 0, any algorithm that α-approximates maximum coverage with probability at least 0.9 requires Ω(m^{1-δ}) space (i.e., high space), even for the special case of k = 1.
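To make the two input models concrete, the sketch below builds a random-arrival set stream and a random-arrival edge stream from the same instance. It only illustrates the stream formats described in the abstract; the function names `set_stream` and `edge_stream` and the in-memory representation are assumptions, and the sketch does not implement any of the paper's algorithms.

```python
import random
from typing import Dict, FrozenSet, List, Set, Tuple


def set_stream(sets: Dict[str, Set[int]], rng: random.Random) -> List[Tuple[str, FrozenSet[int]]]:
    """Set-streaming model: each set arrives whole; the order of sets is random."""
    names = list(sets)
    rng.shuffle(names)  # random permutation of the sets
    return [(name, frozenset(sets[name])) for name in names]


def edge_stream(sets: Dict[str, Set[int]], rng: random.Random) -> List[Tuple[str, int]]:
    """Edge-streaming model: (set, element) pairs arrive one by one in random order."""
    edges = [(name, elem) for name, elems in sets.items() for elem in sorted(elems)]
    rng.shuffle(edges)  # random permutation of the membership pairs
    return edges


if __name__ == "__main__":
    instance = {"A": {1, 2, 3}, "B": {3, 4}, "C": {5}}
    rng = random.Random(0)  # fixed seed only so the demo is repeatable
    print(set_stream(instance, rng))
    print(edge_stream(instance, rng))
```

In the set stream every item carries a complete set, whereas in the edge stream the pairs belonging to one set may be scattered throughout the stream.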

Subject Classification

ACM Subject Classification
  • Theory of computation → Random order and robust communication complexity
  • Theory of computation → Sketching and sampling
Keywords
  • Maximum Coverage
  • Streaming Algorithm
  • Random Arrival
  • Greedy Algorithm
  • Communication Complexity
