DROPS

Document

APPROX

DOI: 10.4230/LIPIcs.APPROX/RANDOM.2024.4

Hybrid k-Clustering: Blending k-Median and k-Center

Authors: Fedor V. Fomin, Petr A. Golovach, Tanmay Inamdar, Saket Saurabh, and Meirav Zehavi

Published in: LIPIcs, Volume 317, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)

Abstract

We propose a novel clustering model encompassing two well-known clustering models: k-center clustering and k-median clustering. In the Hybrid k-Clustering problem, given a set P of points in ℝ^d, an integer k, and a non-negative real r, our objective is to position k closed balls of radius r to minimize the sum of distances from points not covered by the balls to their closest balls. Equivalently, we seek an optimal L₁-fitting of a union of k balls of radius r to a set of points in the Euclidean space. When r = 0, this corresponds to k-median; when the minimum sum is zero, indicating complete coverage of all points, it is k-center. Our primary result is a bicriteria approximation algorithm that, for a given ε > 0, produces a hybrid k-clustering with balls of radius (1+ε)r. This algorithm achieves a cost at most 1+ε of the optimum, and it operates in time 2^{(kd/ε)^𝒪(1)} ⋅ n^𝒪(1). Notably, considering the established lower bounds on k-center and k-median, our bicriteria approximation stands as the best possible result for Hybrid k-Clustering.

Cite as

Fedor V. Fomin, Petr A. Golovach, Tanmay Inamdar, Saket Saurabh, and Meirav Zehavi. Hybrid k-Clustering: Blending k-Median and k-Center. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 317, pp. 4:1-4:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{fomin_et_al:LIPIcs.APPROX/RANDOM.2024.4,
  author =	{Fomin, Fedor V. and Golovach, Petr A. and Inamdar, Tanmay and Saurabh, Saket and Zehavi, Meirav},
  title =	{{Hybrid k-Clustering: Blending k-Median and k-Center}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2024)},
  pages =	{4:1--4:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-348-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{317},
  editor =	{Kumar, Amit and Ron-Zewi, Noga},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX/RANDOM.2024.4},
  URN =		{urn:nbn:de:0030-drops-209975},
  doi =		{10.4230/LIPIcs.APPROX/RANDOM.2024.4},
  annote =	{Keywords: clustering, k-center, k-median, Euclidean space, fpt approximation}
}

Document

DOI: 10.4230/LIPIcs.WABI.2024.11

The mod-minimizer: A Simple and Efficient Sampling Algorithm for Long k-mers

Authors: Ragnar Groot Koerkamp and Giulio Ermanno Pibiri

Published in: LIPIcs, Volume 312, 24th International Workshop on Algorithms in Bioinformatics (WABI 2024)

Abstract

Motivation. Given a string S, a minimizer scheme is an algorithm defined by a triple (k,w,𝒪) that samples a subset of k-mers (k-long substrings) from a string S. Specifically, it samples the smallest k-mer according to the order 𝒪 from each window of w consecutive k-mers in S. Because consecutive windows can sample the same k-mer, the set of the sampled k-mers is typically much smaller than S. More generally, we consider substring sampling algorithms that respect a window guarantee: at least one k-mer must be sampled from every window of w consecutive k-mers. As a sampled k-mer is uniquely identified by its absolute position in S, we can define the density of a sampling algorithm as the fraction of distinct sampled positions. Good methods have low density which, by respecting the window guarantee, is lower bounded by 1/w. It is however difficult to design a sequence-agnostic algorithm with provably optimal density. In practice, the order 𝒪 is usually implemented using a pseudo-random hash function to obtain the so-called random minimizer. This scheme is simple to implement, very fast to compute even in streaming fashion, and easy to analyze. However, its density is almost a factor of 2 away from the lower bound for large windows. Methods. In this work we introduce mod-sampling, a two-step sampling algorithm to obtain new minimizer schemes. Given a (small) parameter t, the mod-sampling algorithm finds the position p of the smallest t-mer in a window. It then samples the k-mer at position pod w. The lr-minimizer uses t = k-w and the mod-minimizer uses t≡ k (mod w). Results. These new schemes have provably lower density than random minimizers and other schemes when k is large compared to w, while being as fast to compute. Importantly, the mod-minimizer achieves optimal density when k → ∞. Although the mod-minimizer is not the first method to achieve optimal density for large k, its proof of optimality is simpler than previous work. We provide pseudocode for a number of other methods and compare to them. In practice, the mod-minimizer has considerably lower density than the random minimizer and other state-of-the-art methods, like closed syncmers and miniception, when k > w. We plugged the mod-minimizer into SSHash, a k-mer dictionary based on minimizers. For default parameters (w,k) = (11,21), space usage decreases by 15% when indexing the whole human genome (GRCh38), while maintaining its fast query time.

Cite as

Ragnar Groot Koerkamp and Giulio Ermanno Pibiri. The mod-minimizer: A Simple and Efficient Sampling Algorithm for Long k-mers. In 24th International Workshop on Algorithms in Bioinformatics (WABI 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 312, pp. 11:1-11:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{grootkoerkamp_et_al:LIPIcs.WABI.2024.11,
  author =	{Groot Koerkamp, Ragnar and Pibiri, Giulio Ermanno},
  title =	{{The mod-minimizer: A Simple and Efficient Sampling Algorithm for Long k-mers}},
  booktitle =	{24th International Workshop on Algorithms in Bioinformatics (WABI 2024)},
  pages =	{11:1--11:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-340-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{312},
  editor =	{Pissis, Solon P. and Sung, Wing-Kin},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2024.11},
  URN =		{urn:nbn:de:0030-drops-206552},
  doi =		{10.4230/LIPIcs.WABI.2024.11},
  annote =	{Keywords: Minimizers, Randomized algorithms, Sketching, Hashing}
}

Document

DOI: 10.4230/LIPIcs.MFCS.2024.69

Scheduling with Locality by Routing

Authors: Alison Hsiang-Hsuan Liu and Fu-Hong Liu

Published in: LIPIcs, Volume 306, 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)

Abstract

This work examines a strongly NP-hard routing problem on trees, in which multiple servers need to serve a given set of requests (on vertices), where the routes of the servers start from a common source and end at their respective terminals. Each server can travel free of cost on its source-to-terminal path but has to pay for travel on other edges. The objective is to minimize the maximum cost over all servers. As the servers may pay different costs for traveling through a common edge, balancing the loads of the servers can be difficult. We propose a polynomial-time 4-approximation algorithm that applies the parametric pruning framework but consists of two phases. The first phase of the algorithm partitions the requests into packets, and the second phase of the algorithm assigns the packets to the servers. Unlike the standard parametric pruning techniques, the challenge of our algorithm design and analysis is to harmoniously relate the quality of the partition in the first phase, the balances of the servers' loads in the second phase, and the hypothetical optimal values of the framework. For the problem in general graphs, we show that there is no algorithm better than 2-approximate unless P = NP. The problem is a generalization of unrelated machine scheduling and other classic scheduling problems. It also models scheduling problems where the job processing times depend on the machine serving the job and the other jobs served by that machine. This modeling provides a framework that physicalizes scheduling problems through the graph’s point of view.

Cite as

Alison Hsiang-Hsuan Liu and Fu-Hong Liu. Scheduling with Locality by Routing. In 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 306, pp. 69:1-69:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{liu_et_al:LIPIcs.MFCS.2024.69,
  author =	{Liu, Alison Hsiang-Hsuan and Liu, Fu-Hong},
  title =	{{Scheduling with Locality by Routing}},
  booktitle =	{49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)},
  pages =	{69:1--69:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-335-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{306},
  editor =	{Kr\'{a}lovi\v{c}, Rastislav and Ku\v{c}era, Anton{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2024.69},
  URN =		{urn:nbn:de:0030-drops-206250},
  doi =		{10.4230/LIPIcs.MFCS.2024.69},
  annote =	{Keywords: Makespan minimization, Approximation algorithms, Routing problems, Parametric pruning framework}
}

Document

DOI: 10.4230/LIPIcs.MFCS.2024.84

Faster Approximation Schemes for (Constrained) k-Means with Outliers

Authors: Zhen Zhang, Junyu Huang, and Qilong Feng

Published in: LIPIcs, Volume 306, 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)

Abstract

Given a set of n points in ℝ^d and two positive integers k and m, the Euclidean k-means with outliers problem aims to remove at most m points, referred to as outliers, and minimize the k-means cost function for the remaining points. Developing algorithms for this problem remains an active area of research due to its prevalence in applications involving noisy data. In this paper, we give a (1+ε)-approximation algorithm that runs in n²d((k+m)ε^{-1})^O(kε^{-1}) time for the problem. When combined with a coreset construction method, the running time of the algorithm can be improved to be linear in n. For the case where k is a constant, this represents the first polynomial-time approximation scheme for the problem: Existing algorithms with the same approximation guarantee run in polynomial time only when both k and m are constants. Furthermore, our approach generalizes to variants of k-means with outliers incorporating additional constraints on instances, such as those related to capacities and fairness.

Cite as

Zhen Zhang, Junyu Huang, and Qilong Feng. Faster Approximation Schemes for (Constrained) k-Means with Outliers. In 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 306, pp. 84:1-84:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{zhang_et_al:LIPIcs.MFCS.2024.84,
  author =	{Zhang, Zhen and Huang, Junyu and Feng, Qilong},
  title =	{{Faster Approximation Schemes for (Constrained) k-Means with Outliers}},
  booktitle =	{49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)},
  pages =	{84:1--84:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-335-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{306},
  editor =	{Kr\'{a}lovi\v{c}, Rastislav and Ku\v{c}era, Anton{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2024.84},
  URN =		{urn:nbn:de:0030-drops-206408},
  doi =		{10.4230/LIPIcs.MFCS.2024.84},
  annote =	{Keywords: Approximation algorithms, clustering}
}

Document

DOI: 10.4230/LIPIcs.CCC.2024.27

Baby PIH: Parameterized Inapproximability of Min CSP

Authors: Venkatesan Guruswami, Xuandi Ren, and Sai Sandeep

Published in: LIPIcs, Volume 300, 39th Computational Complexity Conference (CCC 2024)

Abstract

The Parameterized Inapproximability Hypothesis (PIH) is the analog of the PCP theorem in the world of parameterized complexity. It asserts that no FPT algorithm can distinguish a satisfiable 2CSP instance from one which is only (1-ε)-satisfiable (where the parameter is the number of variables) for some constant 0 < ε < 1. We consider a minimization version of CSPs (Min-CSP), where one may assign r values to each variable, and the goal is to ensure that every constraint is satisfied by some choice among the r × r pairs of values assigned to its variables (call such a CSP instance r-list-satisfiable). We prove the following strong parameterized inapproximability for Min CSP: For every r ≥ 1, it is W[1]-hard to tell if a 2CSP instance is satisfiable or is not even r-list-satisfiable. We refer to this statement as "Baby PIH", following the recently proved Baby PCP Theorem (Barto and Kozik, 2021). Our proof adapts the combinatorial arguments underlying the Baby PCP theorem, overcoming some basic obstacles that arise in the parameterized setting. Furthermore, our reduction runs in time polynomially bounded in both the number of variables and the alphabet size, and thus implies the Baby PCP theorem as well.

Cite as

Venkatesan Guruswami, Xuandi Ren, and Sai Sandeep. Baby PIH: Parameterized Inapproximability of Min CSP. In 39th Computational Complexity Conference (CCC 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 300, pp. 27:1-27:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{guruswami_et_al:LIPIcs.CCC.2024.27,
  author =	{Guruswami, Venkatesan and Ren, Xuandi and Sandeep, Sai},
  title =	{{Baby PIH: Parameterized Inapproximability of Min CSP}},
  booktitle =	{39th Computational Complexity Conference (CCC 2024)},
  pages =	{27:1--27:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-331-7},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{300},
  editor =	{Santhanam, Rahul},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CCC.2024.27},
  URN =		{urn:nbn:de:0030-drops-204237},
  doi =		{10.4230/LIPIcs.CCC.2024.27},
  annote =	{Keywords: Parameterized Inapproximability Hypothesis, Constraint Satisfaction Problems}
}

Document

DOI: 10.4230/LIPIcs.SEA.2024.7

Local Search k-means++ with Foresight

Authors: Theo Conrads, Lukas Drexler, Joshua Könen, Daniel R. Schmidt, and Melanie Schmidt

Published in: LIPIcs, Volume 301, 22nd International Symposium on Experimental Algorithms (SEA 2024)

Abstract

Since its introduction in 1957, Lloyd’s algorithm for k-means clustering has been extensively studied and has undergone several improvements. While in its original form it does not guarantee any approximation factor at all, Arthur and Vassilvitskii (SODA 2007) proposed k-means++ which enhances Lloyd’s algorithm by a seeding method which guarantees a 𝒪(log k)-approximation in expectation. More recently, Lattanzi and Sohler (ICML 2019) proposed LS++ which further improves the solution quality of k-means++ by local search techniques to obtain a 𝒪(1)-approximation. On the practical side, the greedy variant of k-means++ is often used although its worst-case behaviour is provably worse than for the standard k-means++ variant. We investigate how to improve LS++ further in practice. We study two options for improving the practical performance: (a) Combining LS++ with greedy k-means++ instead of k-means++, and (b) Improving LS++ by better entangling it with Lloyd’s algorithm. Option (a) worsens the theoretical guarantees of k-means++ but improves the practical quality also in combination with LS++ as we confirm in our experiments. Option (b) is our new algorithm, Foresight LS++. We experimentally show that FLS++ improves upon the solution quality of LS++. It retains its asymptotic runtime and its worst-case approximation bounds.

Cite as

Theo Conrads, Lukas Drexler, Joshua Könen, Daniel R. Schmidt, and Melanie Schmidt. Local Search k-means++ with Foresight. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 7:1-7:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{conrads_et_al:LIPIcs.SEA.2024.7,
  author =	{Conrads, Theo and Drexler, Lukas and K\"{o}nen, Joshua and Schmidt, Daniel R. and Schmidt, Melanie},
  title =	{{Local Search k-means++ with Foresight}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{7:1--7:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.7},
  URN =		{urn:nbn:de:0030-drops-203727},
  doi =		{10.4230/LIPIcs.SEA.2024.7},
  annote =	{Keywords: k-means clustering, kmeans++, greedy, local search}
}

Document

Track A: Algorithms, Complexity and Games

DOI: 10.4230/LIPIcs.ICALP.2024.6

Parameterized Approximation For Robust Clustering in Discrete Geometric Spaces

Authors: Fateme Abbasi, Sandip Banerjee, Jarosław Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, and Joachim Spoerhase

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)

Abstract

We consider the well-studied Robust (k,z)-Clustering problem, which generalizes the classic k-Median, k-Means, and k-Center problems and arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness [Abbasi, Bhaskara, Venkatasubramanian, 2021 & Ghadiri, Samadi, Vempala, 2022]. Given a constant z ≥ 1, the input to Robust (k,z)-Clustering is a set P of n points in a metric space (M,δ), a weight function w: P → ℝ_{≥ 0} and a positive integer k. Further, each point belongs to one (or more) of the m many different groups S_1,S_2,…,S_m ⊆ P. Our goal is to find a set X of k centers such that max_{i ∈ [m]} ∑_{p ∈ S_i} w(p) δ(p,X)^z is minimized. Complementing recent work on this problem, we give a comprehensive understanding of the parameterized approximability of the problem in geometric spaces where the parameter is the number k of centers. We prove the following results: [(i)] 1) For a universal constant η₀ > 0.0006, we devise a 3^z(1-η₀)-factor FPT approximation algorithm for Robust (k,z)-Clustering in discrete high-dimensional Euclidean spaces where the set of potential centers is finite. This shows that the lower bound of 3^z for general metrics [Goyal, Jaiswal, Inf. Proc. Letters, 2023] no longer holds when the metric has geometric structure. 2) We show that Robust (k,z)-Clustering in discrete Euclidean spaces is (√{3/2}- o(1))-hard to approximate for FPT algorithms, even if we consider the special case k-Center in logarithmic dimensions. This rules out a (1+ε)-approximation algorithm running in time f(k,ε)poly(m,n) (also called efficient parameterized approximation scheme or EPAS), giving a striking contrast with the recent EPAS for the continuous setting where centers can be placed anywhere in the space [Abbasi et al., FOCS'23]. 3) However, we obtain an EPAS for Robust (k,z)-Clustering in discrete Euclidean spaces when the dimension is sublogarithmic (for the discrete problem, earlier work [Abbasi et al., FOCS'23] provides an EPAS only in dimension o(log log n)). Our EPAS works also for metrics of sub-logarithmic doubling dimension.

Cite as

Fateme Abbasi, Sandip Banerjee, Jarosław Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, and Joachim Spoerhase. Parameterized Approximation For Robust Clustering in Discrete Geometric Spaces. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 6:1-6:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{abbasi_et_al:LIPIcs.ICALP.2024.6,
  author =	{Abbasi, Fateme and Banerjee, Sandip and Byrka, Jaros{\l}aw and Chalermsook, Parinya and Gadekar, Ameet and Khodamoradi, Kamyar and Marx, D\'{a}niel and Sharma, Roohani and Spoerhase, Joachim},
  title =	{{Parameterized Approximation For Robust Clustering in Discrete Geometric Spaces}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{6:1--6:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.6},
  URN =		{urn:nbn:de:0030-drops-201494},
  doi =		{10.4230/LIPIcs.ICALP.2024.6},
  annote =	{Keywords: Clustering, approximation algorithms, parameterized complexity}
}

@InProceedings{abbasi_et_al:LIPIcs.ICALP.2024.6,
  author =	{Abbasi, Fateme and Banerjee, Sandip and Byrka, Jaros{\l}aw and Chalermsook, Parinya and Gadekar, Ameet and Khodamoradi, Kamyar and Marx, D\'{a}niel and Sharma, Roohani and Spoerhase, Joachim},
  title =	{{Parameterized Approximation For Robust Clustering in Discrete Geometric Spaces}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{6:1--6:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.6},
  URN =		{urn:nbn:de:0030-drops-201494},
  doi =		{10.4230/LIPIcs.ICALP.2024.6},
  annote =	{Keywords: Clustering, approximation algorithms, parameterized complexity}
}

Document

Media Exposition

DOI: 10.4230/LIPIcs.SoCG.2024.92

Ipelets for the Convex Polygonal Geometry (Media Exposition)

Authors: Nithin Parepally, Ainesh Chatterjee, Auguste H. Gezalyan, Hongyang Du, Sukrit Mangla, Kenny Wu, Sarah Hwang, and David M. Mount

Published in: LIPIcs, Volume 293, 40th International Symposium on Computational Geometry (SoCG 2024)

Abstract

There are many structures, both classical and modern, involving convex polygonal geometries whose deeper understanding would be facilitated through interactive visualizations. The Ipe extensible drawing editor, developed by Otfried Cheong, is a widely used software system for generating geometric figures. One of its features is the capability to extend its functionality through programs called Ipelets. In this media submission, we showcase a collection of new Ipelets that construct a variety of geometric objects based on polygonal geometries. These include Macbeath regions, metric balls in the forward and reverse Funk distance, metric balls in the Hilbert metric, polar bodies, the minimum enclosing ball of a point set, and minimum spanning trees in both the Funk and Hilbert metrics. We also include a number of utilities on convex polygons, including union, intersection, subtraction, and Minkowski sum (previously implemented as a CGAL Ipelet).

Cite as

Nithin Parepally, Ainesh Chatterjee, Auguste H. Gezalyan, Hongyang Du, Sukrit Mangla, Kenny Wu, Sarah Hwang, and David M. Mount. Ipelets for the Convex Polygonal Geometry (Media Exposition). In 40th International Symposium on Computational Geometry (SoCG 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 293, pp. 92:1-92:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{parepally_et_al:LIPIcs.SoCG.2024.92,
  author =	{Parepally, Nithin and Chatterjee, Ainesh and Gezalyan, Auguste H. and Du, Hongyang and Mangla, Sukrit and Wu, Kenny and Hwang, Sarah and Mount, David M.},
  title =	{{Ipelets for the Convex Polygonal Geometry}},
  booktitle =	{40th International Symposium on Computational Geometry (SoCG 2024)},
  pages =	{92:1--92:7},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-316-4},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{293},
  editor =	{Mulzer, Wolfgang and Phillips, Jeff M.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2024.92},
  URN =		{urn:nbn:de:0030-drops-200375},
  doi =		{10.4230/LIPIcs.SoCG.2024.92},
  annote =	{Keywords: Hilbert metric, Macbeath Regions, Polar Bodies, Convexity}
}

Document

DOI: 10.4230/LIPIcs.SWAT.2024.25

Delaunay Triangulations in the Hilbert Metric

Authors: Auguste H. Gezalyan, Soo H. Kim, Carlos Lopez, Daniel Skora, Zofia Stefankovic, and David M. Mount

Published in: LIPIcs, Volume 294, 19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024)

Abstract

The Hilbert metric is a distance function defined for points lying within the interior of a convex body. It arises in the analysis and processing of convex bodies, machine learning, and quantum information theory. In this paper, we show how to adapt the Euclidean Delaunay triangulation to the Hilbert geometry defined by a convex polygon in the plane. We analyze the geometric properties of the Hilbert Delaunay triangulation, which has some notable differences with respect to the Euclidean case, including the fact that the triangulation does not necessarily cover the convex hull of the point set. We also introduce the notion of a Hilbert ball at infinity, which is a Hilbert metric ball centered on the boundary of the convex polygon. We present a simple randomized incremental algorithm that computes the Hilbert Delaunay triangulation for a set of n points in the Hilbert geometry defined by a convex m-gon. The algorithm runs in O(n (log n + log³ m)) expected time. In addition we introduce the notion of the Hilbert hull of a set of points, which we define to be the region covered by their Hilbert Delaunay triangulation. We present an algorithm for computing the Hilbert hull in time O(n h log² m), where h is the number of points on the hull’s boundary.

Cite as

Auguste H. Gezalyan, Soo H. Kim, Carlos Lopez, Daniel Skora, Zofia Stefankovic, and David M. Mount. Delaunay Triangulations in the Hilbert Metric. In 19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 294, pp. 25:1-25:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{gezalyan_et_al:LIPIcs.SWAT.2024.25,
  author =	{Gezalyan, Auguste H. and Kim, Soo H. and Lopez, Carlos and Skora, Daniel and Stefankovic, Zofia and Mount, David M.},
  title =	{{Delaunay Triangulations in the Hilbert Metric}},
  booktitle =	{19th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2024)},
  pages =	{25:1--25:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-318-8},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{294},
  editor =	{Bodlaender, Hans L.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2024.25},
  URN =		{urn:nbn:de:0030-drops-200657},
  doi =		{10.4230/LIPIcs.SWAT.2024.25},
  annote =	{Keywords: Delaunay Triangulations, Hilbert metric, convexity, randomized algorithms}
}

Document

DOI: 10.4230/LIPIcs.ESA.2023.5

Smooth Distance Approximation

Authors: Ahmed Abdelkader and David M. Mount

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)

Abstract

Traditional problems in computational geometry involve aspects that are both discrete and continuous. One such example is nearest-neighbor searching, where the input is discrete, but the result depends on distances, which vary continuously. In many real-world applications of geometric data structures, it is assumed that query results are continuous, free of jump discontinuities. This is at odds with many modern data structures in computational geometry, which employ approximations to achieve efficiency, but these approximations often suffer from discontinuities. In this paper, we present a general method for transforming an approximate but discontinuous data structure into one that produces a smooth approximation, while matching the asymptotic space efficiencies of the original. We achieve this by adapting an approach called the partition-of-unity method, which smoothly blends multiple local approximations into a single smooth global approximation. We illustrate the use of this technique in a specific application of approximating the distance to the boundary of a convex polytope in ℝ^d from any point in its interior. We begin by developing a novel data structure that efficiently computes an absolute ε-approximation to this query in time O(log (1/ε)) using O(1/ε^{d/2}) storage space. Then, we proceed to apply the proposed partition-of-unity blending to guarantee the smoothness of the approximate distance field, establishing optimal asymptotic bounds on the norms of its gradient and Hessian.

Cite as

Ahmed Abdelkader and David M. Mount. Smooth Distance Approximation. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 5:1-5:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{abdelkader_et_al:LIPIcs.ESA.2023.5,
  author =	{Abdelkader, Ahmed and Mount, David M.},
  title =	{{Smooth Distance Approximation}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{5:1--5:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.5},
  URN =		{urn:nbn:de:0030-drops-186589},
  doi =		{10.4230/LIPIcs.ESA.2023.5},
  annote =	{Keywords: Approximation algorithms, convexity, continuity, partition of unity}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2023.9

Optimal Volume-Sensitive Bounds for Polytope Approximation

Authors: Sunil Arya and David M. Mount

Published in: LIPIcs, Volume 258, 39th International Symposium on Computational Geometry (SoCG 2023)

Abstract

Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Consider a convex body K of diameter Δ in ℝ^d for fixed d. The objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error ε. It is known from classical results of Dudley (1974) and Bronshteyn and Ivanov (1976) that Θ((Δ/ε)^{(d-1)/2}) vertices (alternatively, facets) are both necessary and sufficient. While this bound is tight in the worst case, that of Euclidean balls, it is far from optimal for skinny convex bodies. A natural way to characterize a convex object’s skinniness is in terms of its relationship to the Euclidean ball. Given a convex body K, define its volume diameter Δ_d to be the diameter of a Euclidean ball of the same volume as K, and define its surface diameter Δ_{d-1} analogously for surface area. It follows from generalizations of the isoperimetric inequality that Δ ≥ Δ_{d-1} ≥ Δ_d. Arya, da Fonseca, and Mount (SoCG 2012) demonstrated that the diameter-based bound could be made surface-area sensitive, improving the above bound to O((Δ_{d-1}/ε)^{(d-1)/2}). In this paper, we strengthen this by proving the existence of an approximation with O((Δ_d/ε)^{(d-1)/2}) facets. This improvement is a result of the combination of a number of new ideas. As in prior work, we exploit properties of the original body and its polar dual. In order to obtain a volume-sensitive bound, we explore the following more general problem. Given two convex bodies, one nested within the other, find a low-complexity convex polytope that is sandwiched between them. We show that this problem can be reduced to a covering problem involving a natural intermediate body based on the harmonic mean. Our proof relies on a geometric analysis of a relative notion of fatness involving these bodies.

Cite as

Sunil Arya and David M. Mount. Optimal Volume-Sensitive Bounds for Polytope Approximation. In 39th International Symposium on Computational Geometry (SoCG 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 258, pp. 9:1-9:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{arya_et_al:LIPIcs.SoCG.2023.9,
  author =	{Arya, Sunil and Mount, David M.},
  title =	{{Optimal Volume-Sensitive Bounds for Polytope Approximation}},
  booktitle =	{39th International Symposium on Computational Geometry (SoCG 2023)},
  pages =	{9:1--9:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-273-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{258},
  editor =	{Chambers, Erin W. and Gudmundsson, Joachim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2023.9},
  URN =		{urn:nbn:de:0030-drops-178592},
  doi =		{10.4230/LIPIcs.SoCG.2023.9},
  annote =	{Keywords: Approximation algorithms, convexity, Macbeath regions}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2023.35

Voronoi Diagrams in the Hilbert Metric

Authors: Auguste H. Gezalyan and David M. Mount

Published in: LIPIcs, Volume 258, 39th International Symposium on Computational Geometry (SoCG 2023)

Abstract

The Hilbert metric is a distance function defined for points lying within a convex body. It generalizes the Cayley-Klein model of hyperbolic geometry to any convex set, and it has numerous applications in the analysis and processing of convex bodies. In this paper, we study the geometric and combinatorial properties of the Voronoi diagram of a set of point sites under the Hilbert metric. Given any m-sided convex polygon Ω in the plane, we present two randomized incremental algorithms and one deterministic algorithm. The first randomized algorithm and the deterministic algorithm compute the Voronoi diagram of a set of n point sites. The second randomized algorithm extends this to compute the Voronoi diagram of the set of n sites, each of which may be a point or a line segment. Our algorithms all run in expected time O(m n log n). The algorithms use O(m n) storage, which matches the worst-case combinatorial complexity of the Voronoi diagram in the Hilbert metric.

Cite as

Auguste H. Gezalyan and David M. Mount. Voronoi Diagrams in the Hilbert Metric. In 39th International Symposium on Computational Geometry (SoCG 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 258, pp. 35:1-35:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

Copy BibTex To Clipboard

@InProceedings{gezalyan_et_al:LIPIcs.SoCG.2023.35,
  author =	{Gezalyan, Auguste H. and Mount, David M.},
  title =	{{Voronoi Diagrams in the Hilbert Metric}},
  booktitle =	{39th International Symposium on Computational Geometry (SoCG 2023)},
  pages =	{35:1--35:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-273-0},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{258},
  editor =	{Chambers, Erin W. and Gudmundsson, Joachim},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2023.35},
  URN =		{urn:nbn:de:0030-drops-178851},
  doi =		{10.4230/LIPIcs.SoCG.2023.35},
  annote =	{Keywords: Voronoi diagrams, Hilbert metric, convexity, randomized algorithms}
}

Document

DOI: 10.4230/LIPIcs.ESA.2021.44

Boundary-Sensitive Approach for Approximate Nearest-Neighbor Classification

Authors: Alejandro Flores-Velazco and David M. Mount

Published in: LIPIcs, Volume 204, 29th Annual European Symposium on Algorithms (ESA 2021)

Abstract

The problem of nearest-neighbor classification is a fundamental technique in machine-learning. Given a training set P of n labeled points in ℝ^d, and an approximation parameter 0 < ε ≤ 1/2, any unlabeled query point should be classified with the class of any of its ε-approximate nearest-neighbors in P. Answering these queries efficiently has been the focus of extensive research, proposing techniques that are mainly tailored towards resolving the more general problem of ε-approximate nearest-neighbor search. While the latest can only hope to provide query time and space complexities dependent on n, the problem of nearest-neighbor classification accepts other parameters more suitable to its analysis. Such is the number k_ε of ε-border points, which describes the complexity of boundaries between sets of points of different classes. This paper presents a new data structure called Chromatic AVD. This is the first approach for ε-approximate nearest-neighbor classification whose space and query time complexities are only dependent on ε, k_ε and d, while being independent on both n and Δ, the spread of P.

Cite as

Alejandro Flores-Velazco and David M. Mount. Boundary-Sensitive Approach for Approximate Nearest-Neighbor Classification. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 44:1-44:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{floresvelazco_et_al:LIPIcs.ESA.2021.44,
  author =	{Flores-Velazco, Alejandro and Mount, David M.},
  title =	{{Boundary-Sensitive Approach for Approximate Nearest-Neighbor Classification}},
  booktitle =	{29th Annual European Symposium on Algorithms (ESA 2021)},
  pages =	{44:1--44:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-204-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{204},
  editor =	{Mutzel, Petra and Pagh, Rasmus and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2021.44},
  URN =		{urn:nbn:de:0030-drops-146252},
  doi =		{10.4230/LIPIcs.ESA.2021.44},
  annote =	{Keywords: approximate nearest-neighbor searching, nearest-neighbor classification, geometric data structures, space-time tradeoffs}
}

Document

DOI: 10.4230/LIPIcs.SoCG.2021.4

Approximate Nearest-Neighbor Search for Line Segments

Authors: Ahmed Abdelkader and David M. Mount

Published in: LIPIcs, Volume 189, 37th International Symposium on Computational Geometry (SoCG 2021)

Abstract

Approximate nearest-neighbor search is a fundamental algorithmic problem that continues to inspire study due its essential role in numerous contexts. In contrast to most prior work, which has focused on point sets, we consider nearest-neighbor queries against a set of line segments in ℝ^d, for constant dimension d. Given a set S of n disjoint line segments in ℝ^d and an error parameter ε > 0, the objective is to build a data structure such that for any query point q, it is possible to return a line segment whose Euclidean distance from q is at most (1+ε) times the distance from q to its nearest line segment. We present a data structure for this problem with storage O((n²/ε^d) log (Δ/ε)) and query time O(log (max(n,Δ)/ε)), where Δ is the spread of the set of segments S. Our approach is based on a covering of space by anisotropic elements, which align themselves according to the orientations of nearby segments.

Cite as

Ahmed Abdelkader and David M. Mount. Approximate Nearest-Neighbor Search for Line Segments. In 37th International Symposium on Computational Geometry (SoCG 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 189, pp. 4:1-4:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)

Copy BibTex To Clipboard

@InProceedings{abdelkader_et_al:LIPIcs.SoCG.2021.4,
  author =	{Abdelkader, Ahmed and Mount, David M.},
  title =	{{Approximate Nearest-Neighbor Search for Line Segments}},
  booktitle =	{37th International Symposium on Computational Geometry (SoCG 2021)},
  pages =	{4:1--4:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-184-9},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{189},
  editor =	{Buchin, Kevin and Colin de Verdi\`{e}re, \'{E}ric},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2021.4},
  URN =		{urn:nbn:de:0030-drops-138039},
  doi =		{10.4230/LIPIcs.SoCG.2021.4},
  annote =	{Keywords: Approximate nearest-neighbor searching, Approximate Voronoi diagrams, Line segments, Macbeath regions}
}

Document

DOI: 10.4230/LIPIcs.ESA.2020.47

Coresets for the Nearest-Neighbor Rule

Authors: Alejandro Flores-Velazco and David M. Mount

Published in: LIPIcs, Volume 173, 28th Annual European Symposium on Algorithms (ESA 2020)

Abstract

Given a training set P of labeled points, the nearest-neighbor rule predicts the class of an unlabeled query point as the label of its closest point in the set. To improve the time and space complexity of classification, a natural question is how to reduce the training set without significantly affecting the accuracy of the nearest-neighbor rule. Nearest-neighbor condensation deals with finding a subset R ⊆ P such that for every point p ∈ P, p’s nearest-neighbor in R has the same label as p. This relates to the concept of coresets, which can be broadly defined as subsets of the set, such that an exact result on the coreset corresponds to an approximate result on the original set. However, the guarantees of a coreset hold for any query point, and not only for the points of the training set. This paper introduces the concept of coresets for nearest-neighbor classification. We extend existing criteria used for condensation, and prove sufficient conditions to correctly classify any query point when using these subsets. Additionally, we prove that finding such subsets of minimum cardinality is NP-hard, and propose quadratic-time approximation algorithms with provable upper-bounds on the size of their selected subsets. Moreover, we show how to improve one of these algorithms to have subquadratic runtime, being the first of this kind for condensation.

Cite as

Alejandro Flores-Velazco and David M. Mount. Coresets for the Nearest-Neighbor Rule. In 28th Annual European Symposium on Algorithms (ESA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 173, pp. 47:1-47:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{floresvelazco_et_al:LIPIcs.ESA.2020.47,
  author =	{Flores-Velazco, Alejandro and Mount, David M.},
  title =	{{Coresets for the Nearest-Neighbor Rule}},
  booktitle =	{28th Annual European Symposium on Algorithms (ESA 2020)},
  pages =	{47:1--47:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-162-7},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{173},
  editor =	{Grandoni, Fabrizio and Herman, Grzegorz and Sanders, Peter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2020.47},
  URN =		{urn:nbn:de:0030-drops-129138},
  doi =		{10.4230/LIPIcs.ESA.2020.47},
  annote =	{Keywords: coresets, nearest-neighbor rule, classification, nearest-neighbor condensation, training-set reduction, approximate nearest-neighbor, approximation algorithms}
}

22 Search Results for "Mount, David M."

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Abstract

Cite as

Thanks for your feedback!

Could not send message