Search Results

Documents authored by Huang, Lingxiao


Document
Space Complexity of Euclidean Clustering

Authors: Xiaoyi Zhu, Yuxiang Tian, Lingxiao Huang, and Zengfeng Huang

Published in: LIPIcs, Volume 293, 40th International Symposium on Computational Geometry (SoCG 2024)


Abstract
The (k, z)-Clustering problem in Euclidean space ℝ^d has been extensively studied. Given the scale of data involved, compression methods for the Euclidean (k, z)-Clustering problem, such as data compression and dimension reduction, have received significant attention in the literature. However, the space complexity of the clustering problem, specifically, the number of bits required to compress the cost function within a multiplicative error ε, remains unclear in existing literature. This paper initiates the study of space complexity for Euclidean (k, z)-Clustering and offers both upper and lower bounds. Our space bounds are nearly tight when k is constant, indicating that storing a coreset, a well-known data compression approach, serves as the optimal compression scheme. Furthermore, our lower bound result for (k, z)-Clustering establishes a tight space bound of Θ(n d) for terminal embedding, where n represents the dataset size. Our technical approach leverages new geometric insights for principal angles and discrepancy methods, which may hold independent interest.

Cite as

Xiaoyi Zhu, Yuxiang Tian, Lingxiao Huang, and Zengfeng Huang. Space Complexity of Euclidean Clustering. In 40th International Symposium on Computational Geometry (SoCG 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 293, pp. 82:1-82:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{zhu_et_al:LIPIcs.SoCG.2024.82,
  author =	{Zhu, Xiaoyi and Tian, Yuxiang and Huang, Lingxiao and Huang, Zengfeng},
  title =	{{Space Complexity of Euclidean Clustering}},
  booktitle =	{40th International Symposium on Computational Geometry (SoCG 2024)},
  pages =	{82:1--82:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-316-4},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{293},
  editor =	{Mulzer, Wolfgang and Phillips, Jeff M.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SoCG.2024.82},
  URN =		{urn:nbn:de:0030-drops-200279},
  doi =		{10.4230/LIPIcs.SoCG.2024.82},
  annote =	{Keywords: Space complexity, Euclidean clustering, coreset, terminal embedding}
}
Document
SVM via Saddle Point Optimization: New Bounds and Distributed Algorithms

Authors: Lingxiao Huang, Yifei Jin, and Jian Li

Published in: LIPIcs, Volume 101, 16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018)


Abstract
We study two important SVM variants: hard-margin SVM (for linearly separable cases) and nu-SVM (for linearly non-separable cases). We propose new algorithms from the perspective of saddle point optimization. Our algorithms achieve (1-epsilon)-approximations with running time O~(nd+n sqrt{d / epsilon}) for both variants, where n is the number of points and d is the dimensionality. To the best of our knowledge, the current best algorithm for nu-SVM is based on quadratic programming approach which requires Omega(n^2 d) time in worst case [Joachims, 1998; Platt, 1999]. In the paper, we provide the first nearly linear time algorithm for nu-SVM. The current best algorithm for hard margin SVM achieved by Gilbert algorithm [Gärtner and Jaggi, 2009] requires O(nd / epsilon) time. Our algorithm improves the running time by a factor of sqrt{d}/sqrt{epsilon}. Moreover, our algorithms can be implemented in the distributed settings naturally. We prove that our algorithms require O~(k(d +sqrt{d/epsilon})) communication cost, where k is the number of clients, which almost matches the theoretical lower bound. Numerical experiments support our theory and show that our algorithms converge faster on high dimensional, large and dense data sets, as compared to previous methods.

Cite as

Lingxiao Huang, Yifei Jin, and Jian Li. SVM via Saddle Point Optimization: New Bounds and Distributed Algorithms. In 16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 101, pp. 25:1-25:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.SWAT.2018.25,
  author =	{Huang, Lingxiao and Jin, Yifei and Li, Jian},
  title =	{{SVM via Saddle Point Optimization: New Bounds and Distributed Algorithms}},
  booktitle =	{16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018)},
  pages =	{25:1--25:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-068-2},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{101},
  editor =	{Eppstein, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SWAT.2018.25},
  URN =		{urn:nbn:de:0030-drops-88515},
  doi =		{10.4230/LIPIcs.SWAT.2018.25},
  annote =	{Keywords: nu-SVM, hard-margin SVM, saddle point optimization, distributed algorithm}
}
Document
epsilon-Kernel Coresets for Stochastic Points

Authors: Lingxiao Huang, Jian Li, Jeff M. Phillips, and Haitao Wang

Published in: LIPIcs, Volume 57, 24th Annual European Symposium on Algorithms (ESA 2016)


Abstract
With the dramatic growth in the number of application domains that generate probabilistic, noisy and uncertain data, there has been an increasing interest in designing algorithms for geometric or combinatorial optimization problems over such data. In this paper, we initiate the study of constructing epsilon-kernel coresets for uncertain points. We consider uncertainty in the existential model where each point's location is fixed but only occurs with a certain probability, and the locational model where each point has a probability distribution describing its location. An epsilon-kernel coreset approximates the width of a point set in any direction. We consider approximating the expected width (an epsilon-EXP-KERNEL), as well as the probability distribution on the width (an (epsilon, tau)-QUANT-KERNEL) for any direction. We show that there exists a set of O(epsilon^{-(d-1)/2}) deterministic points which approximate the expected width under the existential and locational models, and we provide efficient algorithms for constructing such coresets. We show, however, it is not always possible to find a subset of the original uncertain points which provides such an approximation. However, if the existential probability of each point is lower bounded by a constant, an epsilon-EXP-KERNEL is still possible. We also provide efficient algorithms for construct an (epsilon, tau)-QUANT-KERNEL coreset in nearly linear time. Our techniques utilize or connect to several important notions in probability and geometry, such as Kolmogorov distances, VC uniform convergence and Tukey depth, and may be useful in other geometric optimization problem in stochastic settings. Finally, combining with known techniques, we show a few applications to approximating the extent of uncertain functions, maintaining extent measures for stochastic moving points and some shape fitting problems under uncertainty.

Cite as

Lingxiao Huang, Jian Li, Jeff M. Phillips, and Haitao Wang. epsilon-Kernel Coresets for Stochastic Points. In 24th Annual European Symposium on Algorithms (ESA 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 57, pp. 50:1-50:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.ESA.2016.50,
  author =	{Huang, Lingxiao and Li, Jian and Phillips, Jeff M. and Wang, Haitao},
  title =	{{epsilon-Kernel Coresets for Stochastic Points}},
  booktitle =	{24th Annual European Symposium on Algorithms (ESA 2016)},
  pages =	{50:1--50:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-015-6},
  ISSN =	{1868-8969},
  year =	{2016},
  volume =	{57},
  editor =	{Sankowski, Piotr and Zaroliagis, Christos},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2016.50},
  URN =		{urn:nbn:de:0030-drops-63921},
  doi =		{10.4230/LIPIcs.ESA.2016.50},
  annote =	{Keywords: e-kernel, coreset, stochastic point, shape fitting}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail