5 Search Results for "Schwiegelshohn, Chris"


Document
An Empirical Evaluation of k-Means Coresets

Authors: Chris Schwiegelshohn and Omar Ali Sheikh-Omar

Published in: LIPIcs, Volume 244, 30th Annual European Symposium on Algorithms (ESA 2022)


Abstract
Coresets are among the most popular paradigms for summarizing data. In particular, there exist many high performance coresets for clustering problems such as k-means in both theory and practice. Curiously, there exists no work on comparing the quality of available k-means coresets. In this paper we perform such an evaluation. There currently is no algorithm known to measure the distortion of a candidate coreset. We provide some evidence as to why this might be computationally difficult. To complement this, we propose a benchmark for which we argue that computing coresets is challenging and which also allows us an easy (heuristic) evaluation of coresets. Using this benchmark and real-world data sets, we conduct an exhaustive evaluation of the most commonly used coreset algorithms from theory and practice.

Cite as

Chris Schwiegelshohn and Omar Ali Sheikh-Omar. An Empirical Evaluation of k-Means Coresets. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 84:1-84:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{schwiegelshohn_et_al:LIPIcs.ESA.2022.84,
  author =	{Schwiegelshohn, Chris and Sheikh-Omar, Omar Ali},
  title =	{{An Empirical Evaluation of k-Means Coresets}},
  booktitle =	{30th Annual European Symposium on Algorithms (ESA 2022)},
  pages =	{84:1--84:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-247-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{244},
  editor =	{Chechik, Shiri and Navarro, Gonzalo and Rotenberg, Eva and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.84},
  URN =		{urn:nbn:de:0030-drops-170225},
  doi =		{10.4230/LIPIcs.ESA.2022.84},
  annote =	{Keywords: coresets, k-means coresets, evaluation, benchmark}
}
Document
On Finding the Jaccard Center

Authors: Marc Bury and Chris Schwiegelshohn

Published in: LIPIcs, Volume 80, 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)


Abstract
We initiate the study of finding the Jaccard center of a given collection N of sets. For two sets X,Y, the Jaccard index is defined as |X\cap Y|/|X\cup Y| and the corresponding distance is 1-|X\cap Y|/|X\cup Y|. The Jaccard center is a set C minimizing the maximum distance to any set of N. We show that the problem is NP-hard to solve exactly, and that it admits a PTAS while no FPTAS can exist unless P = NP. Furthermore, we show that the problem is fixed parameter tractable in the maximum Hamming norm between Jaccard center and any input set. Our algorithms are based on a compression technique similar in spirit to coresets for the Euclidean 1-center problem. In addition, we also show that, contrary to the previously studied median problem by Chierichetti et al. (SODA 2010), the continuous version of the Jaccard center problem admits a simple polynomial time algorithm.

Cite as

Marc Bury and Chris Schwiegelshohn. On Finding the Jaccard Center. In 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 80, pp. 23:1-23:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{bury_et_al:LIPIcs.ICALP.2017.23,
  author =	{Bury, Marc and Schwiegelshohn, Chris},
  title =	{{On Finding the Jaccard Center}},
  booktitle =	{44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)},
  pages =	{23:1--23:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-041-5},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{80},
  editor =	{Chatzigiannakis, Ioannis and Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2017.23},
  URN =		{urn:nbn:de:0030-drops-73769},
  doi =		{10.4230/LIPIcs.ICALP.2017.23},
  annote =	{Keywords: Clustering, 1-Center, Jaccard}
}
Document
Planar Matching in Streams Revisited

Authors: Andrew McGregor and Sofya Vorotnikova

Published in: LIPIcs, Volume 60, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)


Abstract
We present data stream algorithms for estimating the size or weight of the maximum matching in low arboricity graphs. A large body of work has focused on improving the constant approximation factor for general graphs when the data stream algorithm is permitted O(n polylog n) space where n is the number of nodes. This space is necessary if the algorithm must return the matching. Recently, Esfandiari et al. (SODA 2015) showed that it was possible to estimate the maximum cardinality of a matching in a planar graph up to a factor of 24+epsilon using O(epsilon^{-2} n^{2/3} polylog n) space. We first present an algorithm (with a simple analysis) that improves this to a factor 5+epsilon using the same space. We also improve upon the previous results for other graphs with bounded arboricity. We then present a factor 12.5 approximation for matching in planar graphs that can be implemented using O(log n) space in the adjacency list data stream model where the stream is a concatenation of the adjacency lists of the graph. The main idea behind our results is finding "local" fractional matchings, i.e., fractional matchings where the value of any edge e is solely determined by the edges sharing an endpoint with e. Our work also improves upon the results for the dynamic data stream model where the stream consists of a sequence of edges being inserted and deleted from the graph. We also extend our results to weighted graphs, improving over the bounds given by Bury and Schwiegelshohn (ESA 2015), via a reduction to the unweighted problem that increases the approximation by at most a factor of two.

Cite as

Andrew McGregor and Sofya Vorotnikova. Planar Matching in Streams Revisited. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 60, pp. 17:1-17:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{mcgregor_et_al:LIPIcs.APPROX-RANDOM.2016.17,
  author =	{McGregor, Andrew and Vorotnikova, Sofya},
  title =	{{Planar Matching in Streams Revisited}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2016)},
  pages =	{17:1--17:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-018-7},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{60},
  editor =	{Jansen, Klaus and Mathieu, Claire and Rolim, Jos\'{e} D. P. and Umans, Chris},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.APPROX-RANDOM.2016.17},
  URN =		{urn:nbn:de:0030-drops-66407},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2016.17},
  annote =	{Keywords: data streams, planar graphs, arboricity, matchings}
}
Document
Diameter and k-Center in Sliding Windows

Authors: Vincent Cohen-Addad, Chris Schwiegelshohn, and Christian Sohler

Published in: LIPIcs, Volume 55, 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)


Abstract
In this paper we develop streaming algorithms for the diameter problem and the k-center clustering problem in the sliding window model. In this model we are interested in maintaining a solution for the N most recent points of the stream. In the diameter problem we would like to maintain two points whose distance approximates the diameter of the point set in the window. Our algorithm computes a (3 + epsilon)-approximation and uses O(1/epsilon*ln(alpha)) memory cells, where alpha is the ratio of the largest and smallest distance and is assumed to be known in advance. We also prove that under reasonable assumptions obtaining a (3 - epsilon)-approximation requires Omega(N1/3) space. For the k-center problem, where the goal is to find k centers that minimize the maximum distance of a point to its nearest center, we obtain a (6 + epsilon)-approximation using O(k/epsilon*ln(alpha)) memory cells and a (4 + epsilon)-approximation for the special case k = 2. We also prove that any algorithm for the 2-center problem that achieves an approximation ratio of less than 4 requires Omega(N^{1/3}) space.

Cite as

Vincent Cohen-Addad, Chris Schwiegelshohn, and Christian Sohler. Diameter and k-Center in Sliding Windows. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 55, pp. 19:1-19:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{cohenaddad_et_al:LIPIcs.ICALP.2016.19,
  author =	{Cohen-Addad, Vincent and Schwiegelshohn, Chris and Sohler, Christian},
  title =	{{Diameter and k-Center in Sliding Windows}},
  booktitle =	{43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016)},
  pages =	{19:1--19:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-013-2},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{55},
  editor =	{Chatzigiannakis, Ioannis and Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2016.19},
  URN =		{urn:nbn:de:0030-drops-63401},
  doi =		{10.4230/LIPIcs.ICALP.2016.19},
  annote =	{Keywords: Streaming, k-Center, Diameter, Sliding Windows}
}
Document
The Power of Migration for Online Slack Scheduling

Authors: Chris Schwiegelshohn and Uwe Schwiegelshohn

Published in: LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)


Abstract
We investigate the power of migration in online scheduling for parallel identical machines. Our objective is to maximize the total processing time of accepted jobs. Once we decide to accept a job, we have to complete it before its deadline d that satisfies d >= (1+epsilon)p + r, where p is the processing time, r the submission time and the slack epsilon > 0 a system parameter. Typically, the hard case arises for small slack epsilon << 1, i.e. for near-tight deadlines. Without migration, a greedy acceptance policy is known to be an optimal deterministic online algorithm with a competitive factor of (1+epsilon)/epsilon (DasGupta and Palis, APPROX 2000). Our first contribution is to show that migrations do not improve the competitive ratio of the greedy acceptance policy, i.e. the competitive ratio remains (1+epsilon)/epsilon for any number of machines. Our main contribution is a deterministic online algorithm with almost tight competitive ratio on any number of machines. For a single machine, the competitive factor matches the optimal bound of (1+epsilon)/epsilon of the greedy acceptance policy. The competitive ratio improves with an increasing number of machines. It approaches (1+epsilon) ln((1+epsilon)/epsilon) as the number of machines converges to infinity. This is an exponential improvement over the greedy acceptance policy for small epsilon. Moreover, we show a matching lower bound on the competitive ratio for deterministic algorithms on any number of machines.

Cite as

Chris Schwiegelshohn and Uwe Schwiegelshohn. The Power of Migration for Online Slack Scheduling. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 75:1-75:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{schwiegelshohn_et_al:LIPIcs.ESA.2016.75,
  author =	{Schwiegelshohn, Chris and Schwiegelshohn, Uwe},
  title =	{{The Power of Migration for Online Slack Scheduling}},
  booktitle =	{24th Annual European Symposium on Algorithms (ESA 2016)},
  pages =	{75:1--75:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-015-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{57},
  editor =	{Sankowski, Piotr and Zaroliagis, Christos},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.75},
  URN =		{urn:nbn:de:0030-drops-64162},
  doi =		{10.4230/LIPIcs.ESA.2016.75},
  annote =	{Keywords: Online scheduling, deadlines, preemption with migration, competitive analysis}
}
  • Refine by Author
  • 4 Schwiegelshohn, Chris
  • 1 Bury, Marc
  • 1 Cohen-Addad, Vincent
  • 1 McGregor, Andrew
  • 1 Schwiegelshohn, Uwe
  • Show More...

  • Refine by Classification
  • 1 Information systems → Clustering
  • 1 Theory of computation → Data compression

  • Refine by Keyword
  • 1 1-Center
  • 1 Clustering
  • 1 Diameter
  • 1 Jaccard
  • 1 Online scheduling
  • Show More...

  • Refine by Type
  • 5 document

  • Refine by Publication Year
  • 3 2016
  • 1 2017
  • 1 2022

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail