Coresets for Fuzzy K-Means with Applications

Authors Johannes Blömer, Sascha Brauer, Kathrin Bujna

Johannes Blömer
  • Department of Computer Science, Paderborn University, Paderborn, Germany
Sascha Brauer
  • Department of Computer Science, Paderborn University, Paderborn, Germany
Kathrin Bujna
  • Department of Computer Science, Paderborn University, Paderborn, Germany

Johannes Blömer, Sascha Brauer, and Kathrin Bujna. Coresets for Fuzzy K-Means with Applications. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 46:1-46:12, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2018)


The fuzzy K-means problem is a popular generalization of the well-known K-means problem to soft clusterings. We present the first coresets for fuzzy K-means with size linear in the dimension, polynomial in the number of clusters, and poly-logarithmic in the number of points. We show that these coresets can be employed in the computation of a (1+epsilon)-approximation for fuzzy K-means, improving previously presented results. We further show that our coresets can be maintained in an insertion-only streaming setting, where data points arrive one-by-one.
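
To make the objects in the abstract concrete, the following is a minimal sketch of the standard fuzzy K-means cost on a weighted point set, which is the quantity a coreset is meant to preserve approximately for every choice of centers. It is not the coreset construction from the paper. The function names, the fuzzifier default m = 2, and the toy data are assumptions chosen for illustration. For fixed centers it uses the well-known optimal memberships, which are proportional to the squared distance raised to the power -1/(m-1).

```python
def sq_dist(x, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def optimal_memberships(x, centers, m=2.0):
    """Memberships of x that minimize the fuzzy cost for fixed centers (m > 1)."""
    d = [sq_dist(x, c) for c in centers]
    # If x coincides with one or more centers, split all membership among them.
    zeros = [k for k, dk in enumerate(d) if dk == 0.0]
    if zeros:
        return [1.0 / len(zeros) if k in zeros else 0.0 for k in range(len(centers))]
    inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_cost(points, weights, centers, m=2.0):
    """Weighted fuzzy K-means cost: sum_n w_n * sum_k r_nk^m * ||x_n - c_k||^2."""
    cost = 0.0
    for x, w in zip(points, weights):
        r = optimal_memberships(x, centers, m)
        cost += w * sum(rk ** m * sq_dist(x, c) for rk, c in zip(r, centers))
    return cost


# Toy data (assumed for illustration): two identical points and one distinct point.
full_points = [(0.0, 0.0), (0.0, 0.0), (4.0, 0.0)]
full_weights = [1.0, 1.0, 1.0]

# Merging the duplicates into one point of weight 2 is a trivial exact summary.
small_points = [(0.0, 0.0), (4.0, 0.0)]
small_weights = [2.0, 1.0]

centers = [(1.0, 0.0), (3.0, 0.0)]
cost_full = fuzzy_cost(full_points, full_weights, centers)
cost_small = fuzzy_cost(small_points, small_weights, centers)
```

In this toy case the weighted summary reproduces the cost exactly because it only merges duplicates. A coreset in the sense of the abstract relaxes this to a (1+epsilon)-factor guarantee that holds for all candidate center sets, while being much smaller than the input.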

Subject Classification

ACM Subject Classification
  • Theory of computation → Unsupervised learning and clustering
Keywords
  • clustering
  • fuzzy k-means
  • coresets
  • approximation algorithms
  • streaming


