Improved Streaming Algorithm for the Klee’s Measure Problem and Generalizations

Authors: Mridul Nandi, N. V. Vinodchandran, Arijit Ghosh, Kuldeep S. Meel, Soumit Pal, and Sourav Chakraborty

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)

Estimating the size of the union of a stream of sets S₁, S₂, …, S_M, where each set is a subset of a known universe Ω, is a fundamental problem in data streaming. This problem naturally generalizes the well-studied 𝖥₀ estimation problem in the streaming literature, where each set contains a single element from the universe. We consider the general case when the sets S_i can be succinctly represented and allow efficient membership, cardinality, and sampling queries (called a Delphic family of sets). A notable example in this framework is Klee’s Measure Problem (KMP), where every set S_i is an axis-parallel rectangle in d-dimensional space (Ω = [Δ]^d where [Δ] := {1, …, Δ} and Δ ∈ ℕ). Recently, Meel, Chakraborty, and Vinodchandran (PODS-21, PODS-22) designed a streaming algorithm for (ε,δ)-estimation of the size of the union of set streams over a Delphic family with space and update time complexity O((log³|Ω|)/ε² ⋅ log(1/δ)) and Õ((log⁴|Ω|)/ε² ⋅ log(1/δ)), respectively. This work presents a new, sampling-based algorithm for estimating the size of the union of Delphic sets that has space and update time complexity Õ((log²|Ω|)/ε² ⋅ log(1/δ)). This improves the space complexity bound by a log|Ω| factor and the update time complexity bound by a log²|Ω| factor. A critical question is whether the quadratic dependence of the space and update time complexities on log|Ω| is necessary. Specifically, can we design a streaming algorithm for estimating the size of the union of sets over a Delphic family with space complexity linear in log|Ω| and update time poly(log|Ω|)? While this appears technically challenging, we show that establishing a space lower bound of ω(log|Ω|) for algorithms with poly(log|Ω|) update time is beyond the reach of current techniques. Specifically, we show that under a certain hard-to-prove computational complexity hypothesis, there is a streaming algorithm for the problem with optimal space complexity O(log|Ω|) and update time poly(log|Ω|). Thus, establishing a space lower bound of ω(log|Ω|) would lead to breakthrough complexity class separation results.
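
The general flavor of a sampling-based union estimator over Delphic sets can be illustrated with a small sketch. This is not the algorithm of the paper: it is a simplified bucket-and-halving scheme, written under the assumption that each set exposes only membership, cardinality, and sampling queries. The names Interval, estimate_union, and thresh are hypothetical, and the one-dimensional integer interval stands in for an axis-parallel rectangle.

```python
import random


class Interval:
    """Integer interval [lo, hi], used here as a 1-dimensional Delphic set."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def contains(self, x):   # membership query
        return self.lo <= x <= self.hi

    def size(self):          # cardinality query
        return self.hi - self.lo + 1

    def sample(self, rng):   # uniform sampling query
        return rng.randint(self.lo, self.hi)


def estimate_union(stream, thresh, rng):
    """Estimate |S_1 ∪ ... ∪ S_M| while keeping at most `thresh` samples.

    Invariant (informally): every element of the union seen so far is in
    `bucket` with probability p, so len(bucket) / p estimates the union size.
    """
    p = 1.0
    bucket = set()
    for s in stream:
        # Forget stored elements covered by s; s is re-sampled from scratch.
        bucket = {x for x in bucket if not s.contains(x)}
        while True:
            # Number of elements of s kept at rate p. Drawn naively in
            # O(|s|) time for clarity; a real implementation would use an
            # efficient binomial sampler instead.
            k = sum(rng.random() < p for _ in range(s.size()))
            if len(bucket) + k <= thresh:
                break
            # Too many samples: halve the rate and thin the bucket.
            bucket = {x for x in bucket if rng.random() < 0.5}
            p /= 2
        # Draw k distinct elements of s by rejection sampling.
        fresh = set()
        while len(fresh) < k:
            fresh.add(s.sample(rng))
        bucket |= fresh
    return len(bucket) / p
```

When thresh is at least the true union size, the rate p never drops below 1 and the estimate is exact; for example, the stream [1,10], [5,20], [30,35] with thresh = 1000 yields 26.0. With a smaller thresh the output is a randomized estimate whose accuracy depends on thresh.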

Mridul Nandi, N. V. Vinodchandran, Arijit Ghosh, Kuldeep S. Meel, Soumit Pal, and Sourav Chakraborty. Improved Streaming Algorithm for the Klee’s Measure Problem and Generalizations. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 26:1-26:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Track A: Algorithms, Complexity and Games
Towards an Analysis of Quadratic Probing

Authors: William Kuszmaul and Zoe Xi

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Since 1968, one of the simplest open questions in the theory of hash tables has been to prove anything nontrivial about the correctness of quadratic probing. We make the first tangible progress towards this goal, showing that there exists a positive-constant load factor at which quadratic probing is a constant-expected-time hash table. Our analysis applies more generally to any fixed-offset open-addressing hash table, and extends to higher load factors in the case where the hash table examines blocks of some size B = ω(1).
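
For readers unfamiliar with the data structure, the following is a minimal sketch of a quadratic probing hash table, in which the i-th probe for a key lands at offset i² from the key's hash position. The class and method names are hypothetical, deletions and resizing are omitted, and nothing here reflects the analysis in the paper.

```python
class QuadraticProbingTable:
    """Open-addressing hash table whose i-th probe is at (h(key) + i*i) mod m."""

    def __init__(self, capacity=11):
        self.slots = [None] * capacity

    def _probe(self, key):
        # Fixed-offset probe sequence: the offsets 0, 1, 4, 9, ... do not
        # depend on the key, only the starting position does.
        m = len(self.slots)
        start = hash(key) % m
        for i in range(m):
            yield (start + i * i) % m

    def insert(self, key):
        """Store key and return the slot index it occupies."""
        for idx in self._probe(key):
            if self.slots[idx] is None or self.slots[idx] == key:
                self.slots[idx] = key
                return idx
        raise RuntimeError("no free slot found along the probe sequence")

    def contains(self, key):
        for idx in self._probe(key):
            if self.slots[idx] is None:
                return False  # an empty slot ends the search
            if self.slots[idx] == key:
                return True
        return False
```

With capacity 11, inserting the keys 0, 11, and 22 (which all hash to position 0) places them in slots 0, 1, and 4, following the offsets 0, 1, and 4.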

William Kuszmaul and Zoe Xi. Towards an Analysis of Quadratic Probing. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 103:1-103:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

08391 Group Summary – The Evolution and Dynamics of Research Networks

Authors: Vladimir Batagelj, Bettina Hoser, Claudia Müller, Steffen Staab, and Gerd Stumme

Published in: Dagstuhl Seminar Proceedings, Volume 8391, Social Web Communities (2008)

Existing collaboration and innovation in scientific communities can be enhanced by understanding the underlying patterns and hidden relations. Social network analysis is an appropriate method to reveal such patterns. Nevertheless, research in this area is mainly focused on social networks. One promising approach is to use homophily networks as well. Furthermore, extending the static network model to a dynamic one makes it possible to understand existing interdependencies in these networks. A mathematical description of possible analyses is given. Finally, resulting research questions are illustrated and the necessity of an interdisciplinary research approach is pointed out.

Vladimir Batagelj, Bettina Hoser, Claudia Müller, Steffen Staab, and Gerd Stumme. 08391 Group Summary – The Evolution and Dynamics of Research Networks. In Social Web Communities. Dagstuhl Seminar Proceedings, Volume 8391, pp. 1-8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

08191 Working Group Report – X-graphs of Y-graphs and their Representations

Authors: Vladimir Batagelj, Franz J. Brandenburg, Walter Didimo, Giuseppe Liotta, and Maurizio Patrignani

Published in: Dagstuhl Seminar Proceedings, Volume 8191, Graph Drawing with Applications to Bioinformatics and Social Sciences (2008)

We address graph decomposition problems that help the hybrid visualization of large graphs, where different graphic metaphors (node-link, matrix, etc.) are used in the same picture. We generalize the $X$-graphs of $Y$-graphs model introduced by Brandenburg (Brandenburg, F.J.: Graph clustering I: Cycles of cliques. In Di Battista, G., ed.: Graph Drawing (Proc. GD '97). Volume 1353 of Lecture Notes Comput. Sci., Springer-Verlag (1997) 158--168) to formalize the problem of automatically identifying dense subgraphs ($Y$-graphs, clusters) that are prone to be collapsed and shown with a matrix representation when needed. We show that (planar, $K_5$)-recognition, that is, the problem of identifying $K_5$ subgraphs such that the graph obtained by collapsing them is planar, is NP-hard. On the positive side, we show that it is possible to determine the highest value of $k$ such that $G$ is a (planar, $k$-core)-graph in $O(m + n \log n)$ time.
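
The $k$-core notion used above can be made concrete with a short sketch of the standard peeling procedure, which repeatedly removes a vertex of minimum remaining degree. This is only the generic core-number computation, not the working group's (planar, $k$-core) algorithm, which additionally involves collapsing subgraphs and testing planarity. The function name core_numbers and the adjacency-dictionary input format are assumptions made for illustration, and the heap-based version shown runs in $O(m \log n)$ time rather than the optimal linear time.

```python
import heapq


def core_numbers(adj):
    """Return {vertex: core number} for an undirected graph.

    `adj` maps each vertex to a list of its neighbours. Vertices must be
    mutually comparable (e.g. all strings) so heap ties can be broken.
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    removed = set()
    core = {}
    k = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != deg[v]:
            continue  # stale heap entry left over from an earlier degree
        # v has minimum remaining degree; the running maximum of these
        # minimum degrees is v's core number.
        k = max(k, d)
        core[v] = k
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
    return core
```

For a triangle a, b, c with an extra vertex d attached only to a, the triangle vertices get core number 2 and d gets core number 1; the largest $k$ with a non-empty $k$-core is then the maximum of the returned values.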

Vladimir Batagelj, Franz J. Brandenburg, Walter Didimo, Giuseppe Liotta, and Maurizio Patrignani. 08191 Working Group Report – X-graphs of Y-graphs and their Representations. In Graph Drawing with Applications to Bioinformatics and Social Sciences. Dagstuhl Seminar Proceedings, Volume 8191, pp. 1-17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)

