Search Results

Documents authored by Zhu, Xiaoyi


Document
Space Complexity of Euclidean Clustering

Authors: Xiaoyi Zhu, Yuxiang Tian, Lingxiao Huang, and Zengfeng Huang

Published in: LIPIcs, Volume 293, 40th International Symposium on Computational Geometry (SoCG 2024)


Abstract
The (k, z)-Clustering problem in Euclidean space ℝ^d has been extensively studied. Given the scale of data involved, compression methods for the Euclidean (k, z)-Clustering problem, such as data compression and dimension reduction, have received significant attention in the literature. However, the space complexity of the clustering problem, specifically, the number of bits required to compress the cost function within a multiplicative error ε, remains unclear in existing literature. This paper initiates the study of space complexity for Euclidean (k, z)-Clustering and offers both upper and lower bounds. Our space bounds are nearly tight when k is constant, indicating that storing a coreset, a well-known data compression approach, serves as the optimal compression scheme. Furthermore, our lower bound result for (k, z)-Clustering establishes a tight space bound of Θ(n d) for terminal embedding, where n represents the dataset size. Our technical approach leverages new geometric insights for principal angles and discrepancy methods, which may hold independent interest.

Cite as

Xiaoyi Zhu, Yuxiang Tian, Lingxiao Huang, and Zengfeng Huang. Space Complexity of Euclidean Clustering. In 40th International Symposium on Computational Geometry (SoCG 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 293, pp. 82:1-82:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{zhu_et_al:LIPIcs.SoCG.2024.82,
  author =	{Zhu, Xiaoyi and Tian, Yuxiang and Huang, Lingxiao and Huang, Zengfeng},
  title =	{{Space Complexity of Euclidean Clustering}},
  booktitle =	{40th International Symposium on Computational Geometry (SoCG 2024)},
  pages =	{82:1--82:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-316-4},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{293},
  editor =	{Mulzer, Wolfgang and Phillips, Jeff M.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2024.82},
  URN =		{urn:nbn:de:0030-drops-200279},
  doi =		{10.4230/LIPIcs.SoCG.2024.82},
  annote =	{Keywords: Space complexity, Euclidean clustering, coreset, terminal embedding}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail