In-Range Farthest Point Queries and Related Problem in High Dimensions
Range-aggregate query is an important type of queries with numerous applications. It aims to obtain some structural information (defined by an aggregate function F(⋅)) of the points (from a point set P) inside a given query range B. In this paper, we study the range-aggregate query problem in high dimensional space for two aggregate functions: (1) F(P ∩ B) is the farthest point in P ∩ B to a query point q in ℝ^d and (2) F(P ∩ B) is the minimum enclosing ball (MEB) of P ∩ B. For problem (1), called In-Range Farthest Point (IFP) Query, we develop a bi-criteria approximation scheme: For any ε > 0 that specifies the approximation ratio of the farthest distance and any γ > 0 that measures the "fuzziness" of the query range, we show that it is possible to pre-process P into a data structure of size Õ_{ε,γ}(dn^{1+ρ}) in Õ_{ε,γ}(dn^{1+ρ}) time such that given any ℝ^d query ball B and query point q, it outputs in Õ_{ε,γ}(dn^ρ) time a point p that is a (1-ε)-approximation of the farthest point to q among all points lying in a (1+γ)-expansion B(1+γ) of B, where 0 < ρ < 1 is a constant depending on ε and γ and the hidden constants in big-O notations depend only on ε, γ and Polylog(nd). For problem (2), we show that the IFP result can be applied to develop query scheme with similar time and space complexities to achieve a (1+ε)-approximation for MEB. To the best of our knowledge, these are the first theoretical results on such high dimensional range-aggregate query problems. Our results are based on several new techniques, such as multi-scale construction and ball difference range query, which are interesting in their own rights and could be potentially used to solve other range-aggregate problems in high dimensional space.
Farthest Point Query
Range Aggregate Query
Minimum Enclosing Ball
Approximation
High Dimensional Space
Theory of computation~Computational geometry
75:1-75:21
Track A: Algorithms, Complexity and Games
https://arxiv.org/abs/2206.07592
Ziyun
Huang
Ziyun Huang
Department of Computer Science and Software Engineering, Penn State Erie, The Behrend College, USA
Jinhui
Xu
Jinhui Xu
Department of Computer Science and Engineering, State University of New York at Buffalo, NY, USA
The research of this author was supported in part by NSF through grant IIS-1910492 and by KAUST through grant CRG10 4663.2.
10.4230/LIPIcs.ICALP.2022.75
Sofiane Abbar, Sihem Amer-Yahia, Piotr Indyk, Sepideh Mahabadi, and Kasturi R Varadarajan. Diverse near neighbor problem. In Proceedings of the twenty-ninth annual symposium on Computational geometry, pages 207-214, 2013.
Mikkel Abrahamsen, Mark de Berg, Kevin Buchin, Mehran Mehr, and Ali D Mehrabi. Range-clustering queries. arXiv preprint, 2017. URL: http://arxiv.org/abs/1705.06242.
http://arxiv.org/abs/1705.06242
Pankaj K Agarwal, Lars Arge, Sathish Govindarajan, Jun Yang, and Ke Yi. Efficient external memory structures for range-aggregate queries. Computational Geometry, 46(3):358-370, 2013.
Pankaj K Agarwal, Jeff Erickson, et al. Geometric range searching and its relatives. Contemporary Mathematics, 223:1-56, 1999.
Alexandr Andoni, Piotr Indyk, Huy L Nguyen, and Ilya Razenshteyn. Beyond locality-sensitive hashing. In Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 1018-1028. SIAM, 2014.
Sunil Arya, David M Mount, and Eunhui Park. Approximate geometric mst range queries. In 31st International Symposium on Computational Geometry (SoCG 2015). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.
Martin Aumüller, Tobias Christiani, Rasmus Pagh, and Francesco Silvestri. Distance-sensitive hashing. In Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 89-104, 2018.
Martin Aumüller, Sariel Har-Peled, Sepideh Mahabadi, Rasmus Pagh, and Francesco Silvestri. Sampling a near neighbor in high dimensions-who is the fairest of them all? arXiv preprint, 2021. URL: http://arxiv.org/abs/2101.10905.
http://arxiv.org/abs/2101.10905
Mihai Badoiu and Kenneth L Clarkson. Smaller core-sets for balls. In SODA, volume 3, pages 801-802, 2003.
Peter Brass, Christian Knauer, Chan-Su Shin, Michiel Smid, and Ivo Vigan. Range-aggregate queries for geometric extent problems. In Proceedings of the Nineteenth Computing: The Australasian Theory Symposium-Volume 141, pages 3-10, 2013.
Ryan R Curtin and Andrew B Gardner. Fast approximate furthest neighbors with data-dependent candidate selection. In International Conference on Similarity Search and Applications, pages 221-235. Springer, 2016.
Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the twentieth annual symposium on Computational geometry, pages 253-262, 2004.
Prosenjit Gupta, Ravi Janardan, Yokesh Kumar, and Michiel Smid. Data structures for range-aggregate extent queries. Computational Geometry, 47(2):329-347, 2014.
Sariel Har-Peled. Geometric approximation algorithms. Number 173. American Mathematical Soc., 2011.
Sariel Har-Peled, Piotr Indyk, and Rajeev Motwani. Approximate nearest neighbor: Towards removing the curse of dimensionality. Theory of computing, 8(1):321-350, 2012.
Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, and Ramakrishnan Srikant. Range queries in olap data cubes. ACM SIGMOD Record, 26(2):73-88, 1997.
Qiang Huang, Jianlin Feng, and Qiong Fang. Reverse query-aware locality-sensitive hashing for high-dimensional furthest neighbor search. In 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pages 167-170. IEEE, 2017.
Ziyun Huang, Hu Ding, and Jinhui Xu. A faster algorithm for truth discovery via range cover. Algorithmica, 81(10):4118-4133, 2019.
Piotr Indyk. Better algorithms for high-dimensional proximity problems via asymmetric embeddings. In Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms, pages 539-545, 2003.
Sankalp Khare, Jatin Agarwal, Nadeem Moidu, and Kannan Srinathan. Improved bounds for smallest enclosing disk range queries. In CCCG. Citeseer, 2014.
Zhe Li, Tsz Nam Chan, Man Lung Yiu, and Christian S Jensen. Polyfit: Polynomial-based indexing approach for fast approximate range aggregate queries. arXiv preprint, 2020. URL: http://arxiv.org/abs/2003.08031.
http://arxiv.org/abs/2003.08031
Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomization and probabilistic techniques in algorithms and data analysis. Cambridge university press, 2017.
Yakov Nekrich and Michiel HM Smid. Approximating range-aggregate queries using coresets. In CCCG, pages 253-256, 2010.
Rasmus Pagh, Francesco Silvestri, Johan Sivertsen, and Matthew Skala. Approximate furthest neighbor in high dimensions. In International Conference on Similarity Search and Applications, pages 3-14. Springer, 2015.
Saladi Rahul, Haritha Bellam, Prosenjit Gupta, and Krishnan Rajan. Range aggregate structures for colored geometric objects. In CCCG, pages 249-252, 2010.
Saladi Rahul, Ananda Swarup Das, KS Rajan, and Kannan Srinathan. Range-aggregate queries involving geometric aggregation operations. In International Workshop on Algorithms and Computation, pages 122-133. Springer, 2011.
Yufei Tao, Xiaokui Xiao, and Reynold Cheng. Range search on multidimensional uncertain data. ACM Transactions on Database Systems (TODS), 32(3):15-es, 2007.
Jeffrey Scott Vitter and Min Wang. Approximate computation of multidimensional aggregates of sparse data using wavelets. Acm Sigmod Record, 28(2):193-204, 1999.
Eugene Wu and Samuel Madden. Scorpion: Explaining away outliers in aggregate queries. Proceedings of the VLDB Endowment, 6(8), 2013.
Jie Xue. Colored range closest-pair problem under general distance functions. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 373-390. SIAM, 2019.
Jie Xue, Yuan Li, and Ravi Janardan. Approximate range closest-pair queries. Computational Geometry, 90:101654, 2020.
Xiaochun Yun, Guangjun Wu, Guangyan Zhang, Keqin Li, and Shupeng Wang. Fastraq: A fast approach to range-aggregate queries in big data environments. IEEE Transactions on Cloud Computing, 3(2):206-218, 2014.
Ziyun Huang and Jinhui Xu
Creative Commons Attribution 4.0 International license
https://creativecommons.org/licenses/by/4.0/legalcode