LIPIcs.ICALP.2023.7.pdf
- Filesize: 0.76 MB
- 17 pages
We study the query version of the approximate heavy hitter and quantile problems. In the former problem, the input is a parameter ε and a set P of n points in ℝ^d where each point is assigned a color from a set C, and the goal is to build a structure such that given any geometric range γ, we can efficiently find a list of approximate heavy hitters in γ∩P, i.e., colors that appear at least ε |γ∩P| times in γ∩P, as well as their frequencies with an additive error of ε |γ∩P|. In the latter problem, each point is assigned a weight from a totally ordered universe and the query must output a sequence S of 1+1/ε weights such that the i-th weight in S has approximate rank iε|γ∩P|, meaning, rank iε|γ∩P| up to an additive error of ε|γ∩P|. Previously, optimal results were only known in 1D [Wei and Yi, 2011] but a few sub-optimal methods were available in higher dimensions [Peyman Afshani and Zhewei Wei, 2017; Pankaj K. Agarwal et al., 2012]. We study the problems for two important classes of geometric ranges: 3D halfspace and 3D dominance queries. It is known that many other important queries can be reduced to these two, e.g., 1D interval stabbing or interval containment, 2D three-sided queries, 2D circular as well as 2D k-nearest neighbors queries. We consider the real RAM model of computation where integer registers of size w bits, w = Θ(log n), are also available. For dominance queries, we show optimal solutions for both heavy hitter and quantile problems: using linear space, we can answer both queries in time O(log n + 1/ε). Note that as the output size is 1/ε, after investing the initial O(log n) searching time, our structure takes on average O(1) time to find a heavy hitter or a quantile! For more general halfspace heavy hitter queries, the same optimal query time can be achieved by increasing the space by an extra log_w(1/ε) (resp. log log_w(1/ε)) factor in 3D (resp. 2D). By spending extra log^O(1)(1/ε) factors in both time and space, we can also support quantile queries. We remark that it is hopeless to achieve a similar query bound for dimensions 4 or higher unless significant advances are made in the data structure side of theory of geometric approximations.
Feedback for Dagstuhl Publishing