DROPS

Document

DOI: 10.4230/LIPIcs.MFCS.2024.84

Faster Approximation Schemes for (Constrained) k-Means with Outliers

Authors: Zhen Zhang, Junyu Huang, and Qilong Feng

Published in: LIPIcs, Volume 306, 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)

Abstract

Given a set of n points in ℝ^d and two positive integers k and m, the Euclidean k-means with outliers problem aims to remove at most m points, referred to as outliers, and minimize the k-means cost function for the remaining points. Developing algorithms for this problem remains an active area of research due to its prevalence in applications involving noisy data. In this paper, we give a (1+ε)-approximation algorithm that runs in n²d((k+m)ε^{-1})^O(kε^{-1}) time for the problem. When combined with a coreset construction method, the running time of the algorithm can be improved to be linear in n. For the case where k is a constant, this represents the first polynomial-time approximation scheme for the problem: Existing algorithms with the same approximation guarantee run in polynomial time only when both k and m are constants. Furthermore, our approach generalizes to variants of k-means with outliers incorporating additional constraints on instances, such as those related to capacities and fairness.

Cite as

Zhen Zhang, Junyu Huang, and Qilong Feng. Faster Approximation Schemes for (Constrained) k-Means with Outliers. In 49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 306, pp. 84:1-84:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)

Copy BibTex To Clipboard

@InProceedings{zhang_et_al:LIPIcs.MFCS.2024.84,
  author =	{Zhang, Zhen and Huang, Junyu and Feng, Qilong},
  title =	{{Faster Approximation Schemes for (Constrained) k-Means with Outliers}},
  booktitle =	{49th International Symposium on Mathematical Foundations of Computer Science (MFCS 2024)},
  pages =	{84:1--84:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-335-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{306},
  editor =	{Kr\'{a}lovi\v{c}, Rastislav and Ku\v{c}era, Anton{\'\i}n},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2024.84},
  URN =		{urn:nbn:de:0030-drops-206408},
  doi =		{10.4230/LIPIcs.MFCS.2024.84},
  annote =	{Keywords: Approximation algorithms, clustering}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2020.5

A Unified Framework of FPT Approximation Algorithms for Clustering Problems

Authors: Qilong Feng, Zhen Zhang, Ziyun Huang, Jinhui Xu, and Jianxin Wang

Published in: LIPIcs, Volume 181, 31st International Symposium on Algorithms and Computation (ISAAC 2020)

Abstract

In this paper, we present a framework for designing FPT approximation algorithms for many k-clustering problems. Our results are based on a new technique for reducing search spaces. A reduced search space is a small subset of the input data that has the guarantee of containing k clients close to the facilities opened in an optimal solution for any clustering problem we consider. We show, somewhat surprisingly, that greedily sampling O(k) clients yields the desired reduced search space, based on which we obtain FPT(k)-time algorithms with improved approximation guarantees for problems such as capacitated clustering, lower-bounded clustering, clustering with service installation costs, fault tolerant clustering, and priority clustering.

Cite as

Qilong Feng, Zhen Zhang, Ziyun Huang, Jinhui Xu, and Jianxin Wang. A Unified Framework of FPT Approximation Algorithms for Clustering Problems. In 31st International Symposium on Algorithms and Computation (ISAAC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 181, pp. 5:1-5:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)

Copy BibTex To Clipboard

@InProceedings{feng_et_al:LIPIcs.ISAAC.2020.5,
  author =	{Feng, Qilong and Zhang, Zhen and Huang, Ziyun and Xu, Jinhui and Wang, Jianxin},
  title =	{{A Unified Framework of FPT Approximation Algorithms for Clustering Problems}},
  booktitle =	{31st International Symposium on Algorithms and Computation (ISAAC 2020)},
  pages =	{5:1--5:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-173-3},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{181},
  editor =	{Cao, Yixin and Cheng, Siu-Wing and Li, Minming},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2020.5},
  URN =		{urn:nbn:de:0030-drops-133495},
  doi =		{10.4230/LIPIcs.ISAAC.2020.5},
  annote =	{Keywords: clustering, approximation algorithms, fixed-parameter tractability}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2019.26

Small Candidate Set for Translational Pattern Search

Authors: Ziyun Huang, Qilong Feng, Jianxin Wang, and Jinhui Xu

Published in: LIPIcs, Volume 149, 30th International Symposium on Algorithms and Computation (ISAAC 2019)

Abstract

In this paper, we study the following pattern search problem: Given a pair of point sets A and B in fixed dimensional space R^d, with |B| = n, |A| = m and n >= m, the pattern search problem is to find the translations T’s of A such that each of the identified translations induces a matching between T(A) and a subset B' of B with cost no more than some given threshold, where the cost is defined as the minimum bipartite matching cost of T(A) and B'. We present a novel algorithm to produce a small set of candidate translations for the pattern search problem. For any B' subseteq B with |B'| = |A|, there exists at least one translation T in the candidate set such that the minimum bipartite matching cost between T(A) and B' is no larger than (1+epsilon) times the minimum bipartite matching cost between A and B' under any translation (i.e., the optimal translational matching cost). We also show that there exists an alternative solution to this problem, which constructs a candidate set of size O(n log^2 n) in O(n log^2 n) time with high probability of success. As a by-product of our construction, we obtain a weak epsilon-net for hypercube ranges, which significantly improves the construction time and the size of the candidate set. Our technique can be applied to a number of applications, including the translational pattern matching problem.

Cite as

Ziyun Huang, Qilong Feng, Jianxin Wang, and Jinhui Xu. Small Candidate Set for Translational Pattern Search. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 26:1-26:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{huang_et_al:LIPIcs.ISAAC.2019.26,
  author =	{Huang, Ziyun and Feng, Qilong and Wang, Jianxin and Xu, Jinhui},
  title =	{{Small Candidate Set for Translational Pattern Search}},
  booktitle =	{30th International Symposium on Algorithms and Computation (ISAAC 2019)},
  pages =	{26:1--26:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-130-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{149},
  editor =	{Lu, Pinyan and Zhang, Guochuan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2019.26},
  URN =		{urn:nbn:de:0030-drops-115222},
  doi =		{10.4230/LIPIcs.ISAAC.2019.26},
  annote =	{Keywords: Bipartite matching, Alignment, Discretization, Approximate algorithm}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2019.61

Improved Algorithms for Clustering with Outliers

Authors: Qilong Feng, Zhen Zhang, Ziyun Huang, Jinhui Xu, and Jianxin Wang

Published in: LIPIcs, Volume 149, 30th International Symposium on Algorithms and Computation (ISAAC 2019)

Abstract

Clustering is a fundamental problem in unsupervised learning. In many real-world applications, the to-be-clustered data often contains various types of noises and thus needs to be removed from the learning process. To address this issue, we consider in this paper two variants of such clustering problems, called k-median with m outliers and k-means with m outliers. Existing techniques for both problems either incur relatively large approximation ratios or can only efficiently deal with a small number of outliers. In this paper, we present improved solution to each of them for the case where k is a fixed number and m could be quite large. Particularly, we gave the first PTAS for the k-median problem with outliers in Euclidean space R^d for possibly high m and d. Our algorithm runs in O(nd((1/epsilon)(k+m))^(k/epsilon)^O(1)) time, which considerably improves the previous result (with running time O(nd(m+k)^O(m+k) + (1/epsilon)k log n)^O(1))) given by [Feldman and Schulman, SODA 2012]. For the k-means with outliers problem, we introduce a (6+epsilon)-approximation algorithm for general metric space with running time O(n(beta (1/epsilon)(k+m))^k) for some constant beta>1. Our algorithm first uses the k-means++ technique to sample O((1/epsilon)(k+m)) points from input and then select the k centers from them. Compared to the more involving existing techniques, our algorithms are much simpler, i.e., using only random sampling, and achieving better performance ratios.

Cite as

Qilong Feng, Zhen Zhang, Ziyun Huang, Jinhui Xu, and Jianxin Wang. Improved Algorithms for Clustering with Outliers. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 61:1-61:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{feng_et_al:LIPIcs.ISAAC.2019.61,
  author =	{Feng, Qilong and Zhang, Zhen and Huang, Ziyun and Xu, Jinhui and Wang, Jianxin},
  title =	{{Improved Algorithms for Clustering with Outliers}},
  booktitle =	{30th International Symposium on Algorithms and Computation (ISAAC 2019)},
  pages =	{61:1--61:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-130-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{149},
  editor =	{Lu, Pinyan and Zhang, Guochuan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2019.61},
  URN =		{urn:nbn:de:0030-drops-115573},
  doi =		{10.4230/LIPIcs.ISAAC.2019.61},
  annote =	{Keywords: Clustering with Outliers, Approximation, Random Sampling}
}

Document

DOI: 10.4230/LIPIcs.ISAAC.2018.31

New Algorithms for Edge Induced König-Egerváry Subgraph Based on Gallai-Edmonds Decomposition

Authors: Qilong Feng, Guanlan Tan, Senmin Zhu, Bin Fu, and Jianxin Wang

Published in: LIPIcs, Volume 123, 29th International Symposium on Algorithms and Computation (ISAAC 2018)

Abstract

König-Egerváry graphs form an important graph class which has been studied extensively in graph theory. Much attention has also been paid on König-Egerváry subgraphs and König-Egerváry graph modification problems. In this paper, we focus on one König-Egerváry subgraph problem, called the Maximum Edge Induced König Subgraph problem. By exploiting the classical Gallai-Edmonds decomposition, we establish connections between minimum vertex cover, Gallai-Edmonds decomposition structure, maximum matching, maximum bisection, and König-Egerváry subgraph structure. We obtain a new structural property of König-Egerváry subgraph: every graph G=(V, E) has an edge induced König-Egerváry subgraph with at least 2|E|/3 edges. Based on the new structural property proposed, an approximation algorithm with ratio 10/7 for the Maximum Edge Induced König Subgraph problem is presented, improving the current best ratio of 5/3. To the best of our knowledge, this paper is the first one establishing the connection between Gallai-Edmonds decomposition and König-Egerváry graphs. Using 2|E|/3 as a lower bound, we define the Edge Induced König Subgraph above lower bound problem, and give a kernel of at most 30k edges for the problem.

Cite as

Qilong Feng, Guanlan Tan, Senmin Zhu, Bin Fu, and Jianxin Wang. New Algorithms for Edge Induced König-Egerváry Subgraph Based on Gallai-Edmonds Decomposition. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 31:1-31:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)

Copy BibTex To Clipboard

@InProceedings{feng_et_al:LIPIcs.ISAAC.2018.31,
  author =	{Feng, Qilong and Tan, Guanlan and Zhu, Senmin and Fu, Bin and Wang, Jianxin},
  title =	{{New Algorithms for Edge Induced K\"{o}nig-Egerv\'{a}ry Subgraph Based on Gallai-Edmonds Decomposition}},
  booktitle =	{29th International Symposium on Algorithms and Computation (ISAAC 2018)},
  pages =	{31:1--31:12},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-094-1},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{123},
  editor =	{Hsu, Wen-Lian and Lee, Der-Tsai and Liao, Chung-Shou},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2018.31},
  URN =		{urn:nbn:de:0030-drops-99790},
  doi =		{10.4230/LIPIcs.ISAAC.2018.31},
  annote =	{Keywords: K\"{o}nig-Egerv\'{a}ry graph, Gallai-Edmonds decomposition}
}

Document

DOI: 10.4230/LIPIcs.MFCS.2017.65

An Improved FPT Algorithm for the Flip Distance Problem

Authors: Shaohua Li, Qilong Feng, Xiangzhong Meng, and Jianxin Wang

Published in: LIPIcs, Volume 83, 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)

Abstract

Given a set \cal P of points in the Euclidean plane and two triangulations of \cal P, the flip distance between these two triangulations is the minimum number of flips required to transform one triangulation into the other. The Parameterized Flip Distance problem is to decide if the flip distance between two given triangulations is equal to a given integer k. The previous best FPT algorithm runs in time O^*(k\cdot c^k) (c\leq 2\times 14^11), where each step has fourteen possible choices, and the length of the action sequence is bounded by 11k. By applying the backtracking strategy and analyzing the underlying property of the flip sequence, each step of our algorithm has only five possible choices. Based on an auxiliary graph G, we prove that the length of the action sequence for our algorithm is bounded by 2|G|. As a result, we present an FPT algorithm running in time O^*(k\cdot 32^k).

Cite as

Shaohua Li, Qilong Feng, Xiangzhong Meng, and Jianxin Wang. An Improved FPT Algorithm for the Flip Distance Problem. In 42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 83, pp. 65:1-65:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{li_et_al:LIPIcs.MFCS.2017.65,
  author =	{Li, Shaohua and Feng, Qilong and Meng, Xiangzhong and Wang, Jianxin},
  title =	{{An Improved FPT Algorithm for the Flip Distance Problem}},
  booktitle =	{42nd International Symposium on Mathematical Foundations of Computer Science (MFCS 2017)},
  pages =	{65:1--65:13},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-046-0},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{83},
  editor =	{Larsen, Kim G. and Bodlaender, Hans L. and Raskin, Jean-Francois},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2017.65},
  URN =		{urn:nbn:de:0030-drops-81100},
  doi =		{10.4230/LIPIcs.MFCS.2017.65},
  annote =	{Keywords: triangulation, flip distance, FPT algorithm}
}

Search Results

Documents authored by Feng, Qilong

Faster Approximation Schemes for (Constrained) k-Means with Outliers

Abstract

Cite as

A Unified Framework of FPT Approximation Algorithms for Clustering Problems

Abstract

Cite as

Small Candidate Set for Translational Pattern Search

Abstract

Cite as

Improved Algorithms for Clustering with Outliers

Abstract

Cite as

New Algorithms for Edge Induced König-Egerváry Subgraph Based on Gallai-Edmonds Decomposition

Abstract

Cite as

An Improved FPT Algorithm for the Flip Distance Problem

Abstract

Cite as

Thanks for your feedback!

Could not send message