Deterministic O(1)-Approximation Algorithms to 1-Center Clustering with Outliers

Author Shyam Narayanan



PDF
Thumbnail PDF

File

LIPIcs.APPROX-RANDOM.2018.21.pdf
  • Filesize: 476 kB
  • 19 pages

Document Identifiers

Author Details

Shyam Narayanan
  • Harvard University, Cambridge, Massachusetts, USA

Cite AsGet BibTex

Shyam Narayanan. Deterministic O(1)-Approximation Algorithms to 1-Center Clustering with Outliers. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 116, pp. 21:1-21:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2018.21

Abstract

The 1-center clustering with outliers problem asks about identifying a prototypical robust statistic that approximates the location of a cluster of points. Given some constant 0 < alpha < 1 and n points such that alpha n of them are in some (unknown) ball of radius r, the goal is to compute a ball of radius O(r) that also contains alpha n points. This problem can be formulated with the points in a normed vector space such as R^d or in a general metric space. The problem has a simple randomized solution: a randomly selected point is a correct solution with constant probability, and its correctness can be verified in linear time. However, the deterministic complexity of this problem was not known. In this paper, for any L^p vector space, we show an O(nd)-time solution with a ball of radius O(r) for a fixed alpha > 1/2, and for any normed vector space, we show an O(nd)-time solution with a ball of radius O(r) when alpha > 1/2 as well as an O(nd log^{(k)}(n))-time solution with a ball of radius O(r) for all alpha > 0, k in N, where log^{(k)}(n) represents the kth iterated logarithm, assuming distance computation and vector space operations take O(d) time. For an arbitrary metric space, we show for any C in N an O(n^{1+1/C})-time solution that finds a ball of radius 2Cr, assuming distance computation between any pair of points takes O(1)-time, and show that for any alpha, C, an O(n^{1+1/C})-time solution that finds a ball of radius ((2C-3)(1-alpha)-1)r cannot exist.

Subject Classification

ACM Subject Classification
  • Theory of computation → Facility location and clustering
  • Theory of computation → Divide and conquer
Keywords
  • Deterministic
  • Approximation Algorithm
  • Cluster
  • Statistic

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. J. Comput. Syst. Sci., 58(1):137-147, 1999. Google Scholar
  2. Robert S. Boyer and J. Strother Moore. MJRTY - A Fast Majority Vote Algorithm. In Automated Reasoning: Essays in Honor of Woody Bledsoe, pages 105-117. Springer Netherlands, 1991. Google Scholar
  3. Deeparnab Chakrabarty, Prachi Goyal, and Ravishankar Krishnaswamy. The Non-Uniform k-Center Problem. In 43rd International Colloquium on Automata, Languages, and Programming, volume 55 of ICALP '16, pages 67:1-67:15, 2016. Google Scholar
  4. Ching-Lueh Chang. Deterministic sublinear-time approximations for metric 1- median selection. Information Processing Letters, 113:288-292, 2013. Google Scholar
  5. Ching-Lueh Chang. A lower bound for metric 1-median selection. Journal of Computer and System Sciences, 84, 2014. Google Scholar
  6. Ching-Lueh Chang. A deterministic sublinear-time nonadaptive algorithm for metric 1-median selection. Theoretical Computer Science, 602:149-157, 2015. Google Scholar
  7. Ching-Lueh Chang. Metric 1-median selection: Query complexity vs. approximation ratio. Transactions on Computation Theory, 9:20:1-20:23, 2018. Google Scholar
  8. Moses Charikar, Samir Khuller, David M. Mount, and Giri Narasimhan. Algorithms for facility location problems with outliers. In SODA '01, pages 642-651, 2001. Google Scholar
  9. Moses Charikar, Jacob Steinhardt, and Gregory Valiant. Learning from untrusted data. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, pages 47-60, 2017. Google Scholar
  10. Hui Han Chin, Aleksander Madry, Gary L. Miller, and Richard Peng. Runtime guarantees for regression problems. In ITCS, pages 269-282, 2013. Google Scholar
  11. Kenneth L. Clarkson and David P. Woodruff. Numerical linear algebra in the streaming model. In Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, STOC '09, pages 205-214, 2009. Google Scholar
  12. Michael B. Cohen, Yin Tat Lee, Gary Miller, Jakub Pachocki, and Aaron Sidford. Geometric median in nearly linear time. In Proceedings of the Forty-eighth Annual ACM Symposium on Theory of Computing, STOC '16, pages 9-21, 2016. Google Scholar
  13. I. Diakonikolas, G. Kamath, D. Kane, J. Li, A. Moitra, and A. Stewart. Robust estimators in high dimensions without the computational intractability. In Proceedings of the 57th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2016, 2016. Google Scholar
  14. Peter J. Huber and Elvezio Ronchetti. Robust Statistics. Wiley, 2009. Google Scholar
  15. R. M. McCutchen and S. Khuller. Streaming algorithms for k-center clustering with outliers and with anonymity. In APPROX-RANDOM, pages 165-178, 2008. Google Scholar
  16. Nimrod Megiddo. The weighted euclidean 1-center problem. Mathematics of Operations Research, 8(4):498-504, 1983. Google Scholar
  17. J. Misra and David Gries. Finding repeated elements. Science of Computer Programming, 2:143-152, 1982. Google Scholar
  18. Hamid Zarrabi-Zadeh and Asish Mukhopadhyay. Streaming 1-center with outliers in high dimensions. In Proceedings of the 21st Annual Canadian Conference on Computational Geometry, CCCG '09, pages 83-86, 2009. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail