Document

**Published in:** Dagstuhl Reports, Volume 13, Issue 8 (2024)

This report documents the program and the outcomes of Dagstuhl Seminar 23342 "Computational Geometry of Earth System Analysis". This seminar brought together experts of algorithms and the Earth sciences to foster collaborations that can tackle algorithmic problems in the Earth system by the crossover of expertise in these different areas. The Earth sciences include a manifold of disciplines that deal with atmospheric, oceanic and terrestrial observations to further our understanding of climate processes. New generations of observation systems that are being developed right now provide novel data about the atmospheric and surface conditions at increasing spatial and temporal resolution. This provides unique information to improve weather and climate prediction but cannot always be handled by traditional numerical models. Computational Geometry is rooted in a strong tradition of algorithm and complexity analysis applied to practical geometric problems. Efficient algorithmic methods developed in this field are often tailored to the low-dimensional geometric settings that arise in a multitude of application areas, but have until recently not been applied to problems arising in the Earth system sciences - and in particular not in meteorology.

Susanne Crewell, Anne Driemel, Jeff M. Phillips, and Dwaipayan Chatterjee. Computational Geometry of Earth System Analysis (Dagstuhl Seminar 23342). In Dagstuhl Reports, Volume 13, Issue 8, pp. 91-105, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@Article{crewell_et_al:DagRep.13.8.91, author = {Crewell, Susanne and Driemel, Anne and Phillips, Jeff M. and Chatterjee, Dwaipayan}, title = {{Computational Geometry of Earth System Analysis (Dagstuhl Seminar 23342)}}, pages = {91--105}, journal = {Dagstuhl Reports}, ISSN = {2192-5283}, year = {2024}, volume = {13}, number = {8}, editor = {Crewell, Susanne and Driemel, Anne and Phillips, Jeff M. and Chatterjee, Dwaipayan}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/DagRep.13.8.91}, URN = {urn:nbn:de:0030-drops-198147}, doi = {10.4230/DagRep.13.8.91}, annote = {Keywords: Data reduction, Event detection, Feature tracking, Geometric algorithms, Interpolation methods, Sensor placement} }

Document

**Published in:** Dagstuhl Reports, Volume 13, Issue 5 (2023)

This report documents the program and the outcomes of the Dagstuhl Seminar 23221 "Computational Geometry". The seminar was held from May 29th to June 2nd, 2023, and 39 participants from various countries attended it, including two remote participants. Recent advances in computational geometry were presented and discussed, and new challenges were identified. This report collects the abstracts of the talks and the open problems presented at the seminar.

Siu-Wing Cheng, Maarten Löffler, Jeff M. Phillips, and Aleksandr Popov. Computational Geometry (Dagstuhl Seminar 23221). In Dagstuhl Reports, Volume 13, Issue 5, pp. 165-181, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@Article{cheng_et_al:DagRep.13.5.165, author = {Cheng, Siu-Wing and L\"{o}ffler, Maarten and Phillips, Jeff M. and Popov, Aleksandr}, title = {{Computational Geometry (Dagstuhl Seminar 23221)}}, pages = {165--181}, journal = {Dagstuhl Reports}, ISSN = {2192-5283}, year = {2023}, volume = {13}, number = {5}, editor = {Cheng, Siu-Wing and L\"{o}ffler, Maarten and Phillips, Jeff M. and Popov, Aleksandr}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.13.5.165}, URN = {urn:nbn:de:0030-drops-193692}, doi = {10.4230/DagRep.13.5.165}, annote = {Keywords: Algorithms, Combinatorics, Geometric Computing, Reconfiguration, Uncertainty} }

Document

**Published in:** LIPIcs, Volume 212, 32nd International Symposium on Algorithms and Computation (ISAAC 2021)

Consider the geometric range space (X, H_d) where X ⊂ ℝ^d and H_d is the set of ranges defined by d-dimensional halfspaces. In this setting we consider that X is the disjoint union of a red and blue set. For each halfspace h ∈ H_d define a function Φ(h) that measures the "difference" between the fraction of red and fraction of blue points which fall in the range h. In this context the maximum discrepancy problem is to find the h^* = arg max_{h ∈ (X, H_d)} Φ(h). We aim to instead find an ĥ such that Φ(h^*) - Φ(ĥ) ≤ ε. This is the central problem in linear classification for machine learning, in spatial scan statistics for spatial anomaly detection, and shows up in many other areas. We provide a solution for this problem in O(|X| + (1/ε^d) log⁴ (1/ε)) time, for constant d, which improves polynomially over the previous best solutions. For d = 2 we show that this is nearly tight through conditional lower bounds. For different classes of Φ we can either provide a Ω(|X|^{3/2 - o(1)}) time lower bound for the exact solution with a reduction to APSP, or an Ω(|X| + 1/ε^{2-o(1)}) lower bound for the approximate solution with a reduction to 3Sum.
A key technical result is a ε-approximate halfspace range counting data structure of size O(1/ε^d) with O(log (1/ε)) query time, which we can build in O(|X| + (1/ε^d) log⁴ (1/ε)) time.

Michael Matheny and Jeff M. Phillips. Approximate Maximum Halfspace Discrepancy. In 32nd International Symposium on Algorithms and Computation (ISAAC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 212, pp. 4:1-4:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{matheny_et_al:LIPIcs.ISAAC.2021.4, author = {Matheny, Michael and Phillips, Jeff M.}, title = {{Approximate Maximum Halfspace Discrepancy}}, booktitle = {32nd International Symposium on Algorithms and Computation (ISAAC 2021)}, pages = {4:1--4:15}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-214-3}, ISSN = {1868-8969}, year = {2021}, volume = {212}, editor = {Ahn, Hee-Kap and Sadakane, Kunihiko}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2021.4}, URN = {urn:nbn:de:0030-drops-154377}, doi = {10.4230/LIPIcs.ISAAC.2021.4}, annote = {Keywords: range spaces, halfspaces, scan statistics, fine-grained complexity} }

Document

**Published in:** Dagstuhl Reports, Volume 11, Issue 4 (2021)

This report documents the program and the outcomes of Dagstuhl Seminar 21181 "Computational Geometry". The seminar was held from May 2 to May 7, 2021. Because of COVID, the seminar was held online in a virtual manner, and 36 participants from various countries attended it. New advances and directions in computational geometry were presented and discussed. The report collects the abstracts of talks and open problems presented in the seminar.

Siu-Wing Cheng, Anne Driemel, and Jeff M. Phillips. Computational Geometry (Dagstuhl Seminar 21181). In Dagstuhl Reports, Volume 11, Issue 4, pp. 1-19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@Article{cheng_et_al:DagRep.11.4.1, author = {Cheng, Siu-Wing and Driemel, Anne and Phillips, Jeff M.}, title = {{Computational Geometry (Dagstuhl Seminar 21181)}}, pages = {1--19}, journal = {Dagstuhl Reports}, ISSN = {2192-5283}, year = {2021}, volume = {11}, number = {4}, editor = {Cheng, Siu-Wing and Driemel, Anne and Phillips, Jeff M.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.11.4.1}, URN = {urn:nbn:de:0030-drops-147963}, doi = {10.4230/DagRep.11.4.1}, annote = {Keywords: algorithms, computational geometry, Computational topology, data structures, Discrete geometry} }

Document

**Published in:** LIPIcs, Volume 204, 29th Annual European Symposium on Algorithms (ESA 2021)

Given points P = {p₁,...,p_n} subset of ℝ^d, how do we find a point x which approximately maximizes the function 1/n ∑_{p_i ∈ P} e^{-‖p_i-x‖²}? In other words, how do we find an approximate mode of a Gaussian kernel density estimate (KDE) of P? Given the power of KDEs in representing probability distributions and other continuous functions, the basic mode finding problem is widely applicable. However, it is poorly understood algorithmically. We provide fast and provably accurate approximation algorithms for mode finding in both the low and high dimensional settings. For low (constant) dimension, our main contribution is a reduction to solving systems of polynomial inequalities. For high dimension, we prove the first dimensionality reduction result for KDE mode finding. The latter result leverages Johnson-Lindenstrauss projection, Kirszbraun’s classic extension theorem, and perhaps surprisingly, the mean-shift heuristic for mode finding. For constant approximation factor these algorithms run in O(n (log n)^{O(d)}) and O(nd + (log n)^{O(log³ n)}), respectively; these are proven more precisely as a (1+ε)-approximation guarantee. Furthermore, for the special case of d = 2, we give a combinatorial algorithm running in O(n log² n) time. We empirically demonstrate that the random projection approach and the 2-dimensional algorithm improves over the state-of-the-art mode-finding heuristics.

Jasper C.H. Lee, Jerry Li, Christopher Musco, Jeff M. Phillips, and Wai Ming Tai. Finding an Approximate Mode of a Kernel Density Estimate. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 61:1-61:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{lee_et_al:LIPIcs.ESA.2021.61, author = {Lee, Jasper C.H. and Li, Jerry and Musco, Christopher and Phillips, Jeff M. and Tai, Wai Ming}, title = {{Finding an Approximate Mode of a Kernel Density Estimate}}, booktitle = {29th Annual European Symposium on Algorithms (ESA 2021)}, pages = {61:1--61:19}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-204-4}, ISSN = {1868-8969}, year = {2021}, volume = {204}, editor = {Mutzel, Petra and Pagh, Rasmus and Herman, Grzegorz}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2021.61}, URN = {urn:nbn:de:0030-drops-146428}, doi = {10.4230/LIPIcs.ESA.2021.61}, annote = {Keywords: Kernel density estimation, Dimensionality reduction, Coresets, Means-shift} }

Document

RANDOM

**Published in:** LIPIcs, Volume 176, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)

We introduce two versions of a new sketch for approximately embedding the Gaussian kernel into Euclidean inner product space. These work by truncating infinite expansions of the Gaussian kernel, and carefully invoking the RecursiveTensorSketch [Ahle et al. SODA 2020]. After providing concentration and approximation properties of these sketches, we use them to approximate the kernel distance between points sets. These sketches yield almost (1+ε)-relative error, but with a small additive α term. In the first variants the dependence on 1/α is poly-logarithmic, but has higher degree of polynomial dependence on the original dimension d. In the second variant, the dependence on 1/α is still poly-logarithmic, but the dependence on d is linear.

Jeff M. Phillips and Wai Ming Tai. The GaussianSketch for Almost Relative Error Kernel Distance. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 176, pp. 12:1-12:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{phillips_et_al:LIPIcs.APPROX/RANDOM.2020.12, author = {Phillips, Jeff M. and Tai, Wai Ming}, title = {{The GaussianSketch for Almost Relative Error Kernel Distance}}, booktitle = {Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)}, pages = {12:1--12:20}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-164-1}, ISSN = {1868-8969}, year = {2020}, volume = {176}, editor = {Byrka, Jaros{\l}aw and Meka, Raghu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2020.12}, URN = {urn:nbn:de:0030-drops-126150}, doi = {10.4230/LIPIcs.APPROX/RANDOM.2020.12}, annote = {Keywords: Kernel Distance, Kernel Density Estimation, Sketching} }

Document

**Published in:** LIPIcs, Volume 164, 36th International Symposium on Computational Geometry (SoCG 2020)

We sketch geometric objects J as vectors through the MinDist function, setting the i-th coordinate v_i(J) = inf_{p ∈ J} ‖p-q_i‖ for q_i ∈ Q from a point set Q. Building a vector from these coordinate values induces a simple, effective, and powerful distance: the Euclidean distance between these sketch vectors. This paper shows how large this set Q needs to be under a variety of shapes and scenarios. For hyperplanes we provide direct connection to the sensitivity sampling framework, so relative error can be preserved in d dimensions using |Q| = O(d/ε²). However, for other shapes, we show we need to enforce a minimum distance parameter ρ, and a domain size L. For d=2 the sample size Q then can be Õ((L/ρ) ⋅ 1/ε²). For objects (e.g., trajectories) with at most k pieces this can provide stronger for all approximations with Õ((L/ρ)⋅ k³ / ε²) points. Moreover, with similar size bounds and restrictions, such trajectories can be reconstructed exactly using only these sketch vectors. Cumulatively, these results demonstrate that these MinDist sketch vectors provide an effective and efficient shape model, a compact representation, and a precise representation of geometric objects.

Jeff M. Phillips and Pingfan Tang. Sketched MinDist. In 36th International Symposium on Computational Geometry (SoCG 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 164, pp. 63:1-63:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{phillips_et_al:LIPIcs.SoCG.2020.63, author = {Phillips, Jeff M. and Tang, Pingfan}, title = {{Sketched MinDist}}, booktitle = {36th International Symposium on Computational Geometry (SoCG 2020)}, pages = {63:1--63:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-143-6}, ISSN = {1868-8969}, year = {2020}, volume = {164}, editor = {Cabello, Sergio and Chen, Danny Z.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2020.63}, URN = {urn:nbn:de:0030-drops-122218}, doi = {10.4230/LIPIcs.SoCG.2020.63}, annote = {Keywords: curve similarity, sensitivity sampling, sketching} }

Document

**Published in:** LIPIcs, Volume 129, 35th International Symposium on Computational Geometry (SoCG 2019)

We revisit the range sampling problem: the input is a set of points where each point is associated with a real-valued weight. The goal is to store them in a structure such that given a query range and an integer k, we can extract k independent random samples from the points inside the query range, where the probability of sampling a point is proportional to its weight.
This line of work was initiated in 2014 by Hu, Qiao, and Tao and it was later followed up by Afshani and Wei. The first line of work mostly studied unweighted but dynamic version of the problem in one dimension whereas the second result considered the static weighted problem in one dimension as well as the unweighted problem in 3D for halfspace queries.
We offer three main results and some interesting insights that were missed by the previous work: We show that it is possible to build efficient data structures for range sampling queries if we allow the query time to hold in expectation (the first result), or obtain efficient worst-case query bounds by allowing the sampling probability to be approximately proportional to the weight (the second result). The third result is a conditional lower bound that shows essentially one of the previous two concessions is needed. For instance, for the 3D range sampling queries, the first two results give efficient data structures with near-linear space and polylogarithmic query time whereas the lower bound shows with near-linear space the worst-case query time must be close to n^{2/3}, ignoring polylogarithmic factors. Up to our knowledge, this is the first such major gap between the expected and worst-case query time of a range searching problem.

Peyman Afshani and Jeff M. Phillips. Independent Range Sampling, Revisited Again. In 35th International Symposium on Computational Geometry (SoCG 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 129, pp. 4:1-4:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{afshani_et_al:LIPIcs.SoCG.2019.4, author = {Afshani, Peyman and Phillips, Jeff M.}, title = {{Independent Range Sampling, Revisited Again}}, booktitle = {35th International Symposium on Computational Geometry (SoCG 2019)}, pages = {4:1--4:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-104-7}, ISSN = {1868-8969}, year = {2019}, volume = {129}, editor = {Barequet, Gill and Wang, Yusu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2019.4}, URN = {urn:nbn:de:0030-drops-104088}, doi = {10.4230/LIPIcs.SoCG.2019.4}, annote = {Keywords: Range Searching, Data Structures, Sampling} }

Document

**Published in:** LIPIcs, Volume 129, 35th International Symposium on Computational Geometry (SoCG 2019)

The Vapnik-Chervonenkis dimension provides a notion of complexity for systems of sets. If the VC dimension is small, then knowing this can drastically simplify fundamental computational tasks such as classification, range counting, and density estimation through the use of sampling bounds. We analyze set systems where the ground set X is a set of polygonal curves in R^d and the sets {R} are metric balls defined by curve similarity metrics, such as the Fréchet distance and the Hausdorff distance, as well as their discrete counterparts. We derive upper and lower bounds on the VC dimension that imply useful sampling bounds in the setting that the number of curves is large, but the complexity of the individual curves is small. Our upper bounds are either near-quadratic or near-linear in the complexity of the curves that define the ranges and they are logarithmic in the complexity of the curves that define the ground set.

Anne Driemel, Jeff M. Phillips, and Ioannis Psarros. The VC Dimension of Metric Balls Under Fréchet and Hausdorff Distances. In 35th International Symposium on Computational Geometry (SoCG 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 129, pp. 28:1-28:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{driemel_et_al:LIPIcs.SoCG.2019.28, author = {Driemel, Anne and Phillips, Jeff M. and Psarros, Ioannis}, title = {{The VC Dimension of Metric Balls Under Fr\'{e}chet and Hausdorff Distances}}, booktitle = {35th International Symposium on Computational Geometry (SoCG 2019)}, pages = {28:1--28:16}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-104-7}, ISSN = {1868-8969}, year = {2019}, volume = {129}, editor = {Barequet, Gill and Wang, Yusu}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2019.28}, URN = {urn:nbn:de:0030-drops-104329}, doi = {10.4230/LIPIcs.SoCG.2019.28}, annote = {Keywords: VC dimension, Fr\'{e}chet distance, Hausdorff distance} }

Document

**Published in:** LIPIcs, Volume 123, 29th International Symposium on Algorithms and Computation (ISAAC 2018)

Consider a geometric range space (X,A) where X is comprised of the union of a red set R and blue set B. Let Phi(A) define the absolute difference between the fraction of red and fraction of blue points which fall in the range A. The maximum discrepancy range A^* = arg max_{A in (X,A)} Phi(A). Our goal is to find some A^ in (X,A) such that Phi(A^*) - Phi(A^) <= epsilon. We develop general algorithms for this approximation problem for range spaces with bounded VC-dimension, as well as significant improvements for specific geometric range spaces defined by balls, halfspaces, and axis-aligned rectangles. This problem has direct applications in discrepancy evaluation and classification, and we also show an improved reduction to a class of problems in spatial scan statistics.

Michael Matheny and Jeff M. Phillips. Computing Approximate Statistical Discrepancy. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 32:1-32:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{matheny_et_al:LIPIcs.ISAAC.2018.32, author = {Matheny, Michael and Phillips, Jeff M.}, title = {{Computing Approximate Statistical Discrepancy}}, booktitle = {29th International Symposium on Algorithms and Computation (ISAAC 2018)}, pages = {32:1--32:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-094-1}, ISSN = {1868-8969}, year = {2018}, volume = {123}, editor = {Hsu, Wen-Lian and Lee, Der-Tsai and Liao, Chung-Shou}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2018.32}, URN = {urn:nbn:de:0030-drops-99800}, doi = {10.4230/LIPIcs.ISAAC.2018.32}, annote = {Keywords: Scan Statistics, Discrepancy, Rectangles} }

Document

**Published in:** LIPIcs, Volume 112, 26th Annual European Symposium on Algorithms (ESA 2018)

We develop, analyze, implement, and compare new algorithms for creating epsilon-samples of range spaces defined by halfspaces which have size sub-quadratic in 1/epsilon, and have runtime linear in the input size and near-quadratic in 1/epsilon. The key to our solution is an efficient construction of partition trees. Despite not requiring any techniques developed after the early 1990s, apparently such a result was never explicitly described. We demonstrate that our implementations, including new implementations of several variants of partition trees, do indeed run in time linear in the input, appear to run linear in output size, and observe smaller error for the same size sample compared to the ubiquitous random sample (which requires size quadratic in 1/epsilon). This result has direct applications in speeding up discrepancy evaluation, approximate range counting, and spatial anomaly detection.

Michael Matheny and Jeff M. Phillips. Practical Low-Dimensional Halfspace Range Space Sampling. In 26th Annual European Symposium on Algorithms (ESA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 112, pp. 62:1-62:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{matheny_et_al:LIPIcs.ESA.2018.62, author = {Matheny, Michael and Phillips, Jeff M.}, title = {{Practical Low-Dimensional Halfspace Range Space Sampling}}, booktitle = {26th Annual European Symposium on Algorithms (ESA 2018)}, pages = {62:1--62:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-081-1}, ISSN = {1868-8969}, year = {2018}, volume = {112}, editor = {Azar, Yossi and Bast, Hannah and Herman, Grzegorz}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.62}, URN = {urn:nbn:de:0030-drops-95250}, doi = {10.4230/LIPIcs.ESA.2018.62}, annote = {Keywords: Partitions, Range Spaces, Sampling, Halfspaces} }

Document

**Published in:** LIPIcs, Volume 99, 34th International Symposium on Computational Geometry (SoCG 2018)

Robust estimators, like the median of a point set, are important for data analysis in the presence of outliers. We study robust estimators for locationally uncertain points with discrete distributions. That is, each point in a data set has a discrete probability distribution describing its location. The probabilistic nature of uncertain data makes it challenging to compute such estimators, since the true value of the estimator is now described by a distribution rather than a single point. We show how to construct and estimate the distribution of the median of a point set. Building the approximate support of the distribution takes near-linear time, and assigning probability to that support takes quadratic time. We also develop a general approximation technique for distributions of robust estimators with respect to ranges with bounded VC dimension. This includes the geometric median for high dimensions and the Siegel estimator for linear regression.

Kevin Buchin, Jeff M. Phillips, and Pingfan Tang. Approximating the Distribution of the Median and other Robust Estimators on Uncertain Data. In 34th International Symposium on Computational Geometry (SoCG 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 99, pp. 16:1-16:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{buchin_et_al:LIPIcs.SoCG.2018.16, author = {Buchin, Kevin and Phillips, Jeff M. and Tang, Pingfan}, title = {{Approximating the Distribution of the Median and other Robust Estimators on Uncertain Data}}, booktitle = {34th International Symposium on Computational Geometry (SoCG 2018)}, pages = {16:1--16:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-066-8}, ISSN = {1868-8969}, year = {2018}, volume = {99}, editor = {Speckmann, Bettina and T\'{o}th, Csaba D.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2018.16}, URN = {urn:nbn:de:0030-drops-87292}, doi = {10.4230/LIPIcs.SoCG.2018.16}, annote = {Keywords: Uncertain Data, Robust Estimators, Geometric Median, Tukey Median} }

Document

**Published in:** LIPIcs, Volume 99, 34th International Symposium on Computational Geometry (SoCG 2018)

We construct near-optimal coresets for kernel density estimate for points in R^d when the kernel is positive definite. Specifically we show a polynomial time construction for a coreset of size O(sqrt{d log (1/epsilon)}/epsilon), and we show a near-matching lower bound of size Omega(sqrt{d}/epsilon). The upper bound is a polynomial in 1/epsilon improvement when d in [3,1/epsilon^2) (for all kernels except the Gaussian kernel which had a previous upper bound of O((1/epsilon) log^d (1/epsilon))) and the lower bound is the first known lower bound to depend on d for this problem. Moreover, the upper bound restriction that the kernel is positive definite is significant in that it applies to a wide-variety of kernels, specifically those most important for machine learning. This includes kernels for information distances and the sinc kernel which can be negative.

Jeff M. Phillips and Wai Ming Tai. Near-Optimal Coresets of Kernel Density Estimates. In 34th International Symposium on Computational Geometry (SoCG 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 99, pp. 66:1-66:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{phillips_et_al:LIPIcs.SoCG.2018.66, author = {Phillips, Jeff M. and Tai, Wai Ming}, title = {{Near-Optimal Coresets of Kernel Density Estimates}}, booktitle = {34th International Symposium on Computational Geometry (SoCG 2018)}, pages = {66:1--66:13}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-066-8}, ISSN = {1868-8969}, year = {2018}, volume = {99}, editor = {Speckmann, Bettina and T\'{o}th, Csaba D.}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2018.66}, URN = {urn:nbn:de:0030-drops-87797}, doi = {10.4230/LIPIcs.SoCG.2018.66}, annote = {Keywords: Coresets, Kernel Density Estimate, Discrepancy} }

Document

**Published in:** LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)

With the dramatic growth in the number of application domains that generate probabilistic, noisy and uncertain data, there has been an increasing interest in designing algorithms for geometric or combinatorial optimization problems over such data. In this paper, we initiate the study of constructing epsilon-kernel coresets for uncertain points. We consider uncertainty in the existential model where each point's location is fixed but only occurs with a certain probability, and the locational model where each point has a probability distribution describing its location. An epsilon-kernel coreset approximates the width of a point set in any direction. We consider approximating the expected width (an epsilon-EXP-KERNEL), as well as the probability distribution on the width (an (epsilon, tau)-QUANT-KERNEL) for any direction. We show that there exists a set of O(epsilon^{-(d-1)/2}) deterministic points which approximate the expected width under the existential and locational models, and we provide efficient algorithms for constructing such coresets. We show, however, it is not always possible to find a subset of the original uncertain points which provides such an approximation. However, if the existential probability of each point is lower bounded by a constant, an epsilon-EXP-KERNEL is still possible. We also provide efficient algorithms for construct an (epsilon, tau)-QUANT-KERNEL coreset in nearly linear time. Our techniques utilize or connect to several important notions in probability and geometry, such as Kolmogorov distances, VC uniform convergence and Tukey depth, and may be useful in other geometric optimization problem in stochastic settings. Finally, combining with known techniques, we show a few applications to approximating the extent of uncertain functions, maintaining extent measures for stochastic moving points and some shape fitting problems under uncertainty.

Lingxiao Huang, Jian Li, Jeff M. Phillips, and Haitao Wang. epsilon-Kernel Coresets for Stochastic Points. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 50:1-50:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)

Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.ESA.2016.50, author = {Huang, Lingxiao and Li, Jian and Phillips, Jeff M. and Wang, Haitao}, title = {{epsilon-Kernel Coresets for Stochastic Points}}, booktitle = {24th Annual European Symposium on Algorithms (ESA 2016)}, pages = {50:1--50:18}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-015-6}, ISSN = {1868-8969}, year = {2016}, volume = {57}, editor = {Sankowski, Piotr and Zaroliagis, Christos}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.50}, URN = {urn:nbn:de:0030-drops-63921}, doi = {10.4230/LIPIcs.ESA.2016.50}, annote = {Keywords: e-kernel, coreset, stochastic point, shape fitting} }

Document

**Published in:** LIPIcs, Volume 34, 31st International Symposium on Computational Geometry (SoCG 2015)

We show that geometric inference of a point cloud can be calculated by examining its kernel density estimate with a Gaussian kernel. This allows one to consider kernel density estimates, which are robust to spatial noise, subsampling, and approximate computation in comparison to raw point sets. This is achieved by examining the sublevel sets of the kernel distance, which isomorphically map to superlevel sets of the kernel density estimate. We prove new properties about the kernel distance, demonstrating stability results and allowing it to inherit reconstruction results from recent advances in distance-based topological reconstruction. Moreover, we provide an algorithm to estimate its topology using weighted Vietoris-Rips complexes.

Jeff M. Phillips, Bei Wang, and Yan Zheng. Geometric Inference on Kernel Density Estimates. In 31st International Symposium on Computational Geometry (SoCG 2015). Leibniz International Proceedings in Informatics (LIPIcs), Volume 34, pp. 857-871, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2015)

Copy BibTex To Clipboard

@InProceedings{phillips_et_al:LIPIcs.SOCG.2015.857, author = {Phillips, Jeff M. and Wang, Bei and Zheng, Yan}, title = {{Geometric Inference on Kernel Density Estimates}}, booktitle = {31st International Symposium on Computational Geometry (SoCG 2015)}, pages = {857--871}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-939897-83-5}, ISSN = {1868-8969}, year = {2015}, volume = {34}, editor = {Arge, Lars and Pach, J\'{a}nos}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.SOCG.2015.857}, URN = {urn:nbn:de:0030-drops-51349}, doi = {10.4230/LIPIcs.SOCG.2015.857}, annote = {Keywords: topological data analysis, kernel density estimate, kernel distance} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail