2 Search Results for "Feldman, Dan"


Document
On k-Means for Segments and Polylines

Authors: Sergio Cabello and Panos Giannopoulos

Published in: LIPIcs, Volume 274, 31st Annual European Symposium on Algorithms (ESA 2023)


Abstract
We study the problem of k-means clustering in the space of straight-line segments in ℝ² under the Hausdorff distance. For this problem, we give a (1+ε)-approximation algorithm that, for an input of n segments, for any fixed k, and with constant success probability, runs in time O(n + ε^{-O(k)} + ε^{-O(k)} ⋅ log^O(k) (ε^{-1})). The algorithm has two main ingredients. Firstly, we express the k-means objective in our metric space as a sum of algebraic functions and use the optimization technique of Vigneron [Antoine Vigneron, 2014] to approximate its minimum. Secondly, we reduce the input size by computing a small size coreset using the sensitivity-based sampling framework by Feldman and Langberg [Dan Feldman and Michael Langberg, 2011; Feldman et al., 2020]. Our results can be extended to polylines of constant complexity with a running time of O(n + ε^{-O(k)}).

Cite as

Sergio Cabello and Panos Giannopoulos. On k-Means for Segments and Polylines. In 31st Annual European Symposium on Algorithms (ESA 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 274, pp. 28:1-28:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{cabello_et_al:LIPIcs.ESA.2023.28,
  author =	{Cabello, Sergio and Giannopoulos, Panos},
  title =	{{On k-Means for Segments and Polylines}},
  booktitle =	{31st Annual European Symposium on Algorithms (ESA 2023)},
  pages =	{28:1--28:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-295-2},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{274},
  editor =	{G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.28},
  URN =		{urn:nbn:de:0030-drops-186812},
  doi =		{10.4230/LIPIcs.ESA.2023.28},
  annote =	{Keywords: k-means clustering, segments, polylines, Hausdorff distance, Fr\'{e}chet mean}
}
Document
RANDOM
Streaming Coreset Constructions for M-Estimators

Authors: Vladimir Braverman, Dan Feldman, Harry Lang, and Daniela Rus

Published in: LIPIcs, Volume 145, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)


Abstract
We introduce a new method of maintaining a (k,epsilon)-coreset for clustering M-estimators over insertion-only streams. Let (P,w) be a weighted set (where w : P - > [0,infty) is the weight function) of points in a rho-metric space (meaning a set X equipped with a positive-semidefinite symmetric function D such that D(x,z) <=rho(D(x,y) + D(y,z)) for all x,y,z in X). For any set of points C, we define COST(P,w,C) = sum_{p in P} w(p) min_{c in C} D(p,c). A (k,epsilon)-coreset for (P,w) is a weighted set (Q,v) such that for every set C of k points, (1-epsilon)COST(P,w,C) <= COST(Q,v,C) <= (1+epsilon)COST(P,w,C). Essentially, the coreset (Q,v) can be used in place of (P,w) for all operations concerning the COST function. Coresets, as a method of data reduction, are used to solve fundamental problems in machine learning of streaming and distributed data. M-estimators are functions D(x,y) that can be written as psi(d(x,y)) where ({X}, d) is a true metric (i.e. 1-metric) space. Special cases of M-estimators include the well-known k-median (psi(x) =x) and k-means (psi(x) = x^2) functions. Our technique takes an existing offline construction for an M-estimator coreset and converts it into the streaming setting, where n data points arrive sequentially. To our knowledge, this is the first streaming construction for any M-estimator that does not rely on the merge-and-reduce tree. For example, our coreset for streaming metric k-means uses O(epsilon^{-2} k log k log n) points of storage. The previous state-of-the-art required storing at least O(epsilon^{-2} k log k log^{4} n) points.

Cite as

Vladimir Braverman, Dan Feldman, Harry Lang, and Daniela Rus. Streaming Coreset Constructions for M-Estimators. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 62:1-62:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{braverman_et_al:LIPIcs.APPROX-RANDOM.2019.62,
  author =	{Braverman, Vladimir and Feldman, Dan and Lang, Harry and Rus, Daniela},
  title =	{{Streaming Coreset Constructions for M-Estimators}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)},
  pages =	{62:1--62:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-125-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{145},
  editor =	{Achlioptas, Dimitris and V\'{e}gh, L\'{a}szl\'{o} A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2019.62},
  URN =		{urn:nbn:de:0030-drops-112778},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2019.62},
  annote =	{Keywords: Streaming, Clustering, Coresets}
}
  • Refine by Author
  • 1 Braverman, Vladimir
  • 1 Cabello, Sergio
  • 1 Feldman, Dan
  • 1 Giannopoulos, Panos
  • 1 Lang, Harry
  • Show More...

  • Refine by Classification
  • 1 Information systems → Query optimization
  • 1 Theory of computation → Computational geometry
  • 1 Theory of computation → Facility location and clustering
  • 1 Theory of computation → Streaming models

  • Refine by Keyword
  • 1 Clustering
  • 1 Coresets
  • 1 Fréchet mean
  • 1 Hausdorff distance
  • 1 Streaming
  • Show More...

  • Refine by Type
  • 2 document

  • Refine by Publication Year
  • 1 2019
  • 1 2023

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail