Document

# k-Center Clustering with Outliers in the Sliding-Window Model

## File

LIPIcs.ESA.2021.13.pdf
• Filesize: 0.72 MB
• 13 pages

## Cite As

Mark de Berg, Morteza Monemizadeh, and Yu Zhong. k-Center Clustering with Outliers in the Sliding-Window Model. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 13:1-13:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/LIPIcs.ESA.2021.13

## Abstract

The k-center problem for a point set P asks for a collection of k congruent balls (that is, balls of equal radius) that together cover all the points in P and whose radius is minimized. The k-center problem with outliers is defined similarly, except that z of the points in P do need not to be covered, for a given parameter z. We study the k-center problem with outliers in data streams in the sliding-window model. In this model we are given a possibly infinite stream P = ⟨ p₁,p₂,p₃,…⟩ of points and a time window of length W, and we want to maintain a small sketch of the set P(t) of points currently in the window such that using the sketch we can approximately solve the problem on P(t). We present the first algorithm for the k-center problem with outliers in the sliding-window model. The algorithm works for the case where the points come from a space of bounded doubling dimension and it maintains a set S(t) such that an optimal solution on S(t) gives a (1+ε)-approximate solution on P(t). The algorithm uses O((kz/ε^d)log σ) storage, where d is the doubling dimension of the underlying space and σ is the spread of the points in the stream. Algorithms providing a (1+ε)-approximation were not even known in the setting without outliers or in the insertion-only setting with outliers. We also present a lower bound showing that any algorithm that provides a (1+ε)-approximation must use Ω((kz/ε)log σ) storage.

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Design and analysis of algorithms
##### Keywords
• Streaming algorithms
• k-center problem
• sliding window
• bounded doubling dimension

## Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0

## References

1. Pankaj K. Agarwal, Sariel Har-Peled, and Hai Yu. Robust shape fitting via peeling and grating coresets. Discret. Comput. Geom., 39(1-3):38-58, 2008. URL: https://doi.org/10.1007/s00454-007-9013-2.
2. Pankaj K. Agarwal and R. Sharathkumar. Streaming algorithms for extent problems in high dimensions. Algorithmica, 72(1):83-98, 2015. URL: https://doi.org/10.1007/s00453-013-9846-4.
3. Matteo Ceccarello, Andrea Pietracaprina, and Geppino Pucci. Solving k-center clustering (with outliers) in mapreduce and streaming, almost as accurately as sequentially. Proc. VLDB Endow., 12(7):766-778, 2019. URL: https://doi.org/10.14778/3317315.3317319.
4. Timothy M. Chan and Vinayak Pathak. Streaming and dynamic algorithms for minimum enclosing balls in high dimensions. Comput. Geom., 47(2):240-247, 2014. URL: https://doi.org/10.1016/j.comgeo.2013.05.007.
5. Timothy M. Chan and Bashir S. Sadjad. Geometric optimization problems over sliding windows. Int. J. Comput. Geom. Appl., 16(2-3):145-158, 2006. URL: https://doi.org/10.1142/S0218195906001975.
6. Moses Charikar, Chandra Chekuri, Tomás Feder, and Rajeev Motwani. Incremental clustering and dynamic information retrieval. SIAM J. Comput., 33(6):1417-1440, 2004. URL: https://doi.org/10.1137/S0097539702418498.
7. Moses Charikar, Samir Khuller, David M Mount, and Giri Narasimhan. Algorithms for facility location problems with outliers. In Proc. 12th Annual ACM-SIAM symposium on Discrete Algorithms (SODA), pages 642-651. Society for Industrial and Applied Mathematics, 2001.
8. Moses Charikar, Liadan O'Callaghan, and Rina Panigrahy. Better streaming algorithms for clustering problems. In Proc. 35th Annual ACM Symposium on Theory of Computing (STOC), pages 30-39, 2003. URL: https://doi.org/10.1145/780542.780548.
9. Vincent Cohen-Addad, Chris Schwiegelshohn, and Christian Sohler. Diameter and k-center in sliding windows. In Proc. 43rd International Colloquium on Automata, Languages, and Programming, (ICALP), volume 55 of LIPIcs, pages 19:1-19:12, 2016. URL: https://doi.org/10.4230/LIPIcs.ICALP.2016.19.
10. Hu Ding, Haikuo Yu, and Zixiu Wang. Greedy strategy works for k-center clustering with outliers and coreset construction. In Proc. 27th Annual European Symposium on Algorithms (ESA), volume 144 of LIPIcs, pages 40:1-40:16, 2019. URL: https://doi.org/10.4230/LIPIcs.ESA.2019.40.
11. Joan Feigenbaum, Sampath Kannan, and Jian Zhang. Computing diameter in the streaming and sliding-window models. Algorithmica, 41(1):25-41, 2005. URL: https://doi.org/10.1007/s00453-004-1105-2.
12. Behnam Hatami and Hamid Zarrabi-Zadeh. A streaming algorithm for 2-center with outliers in high dimensions. Comput. Geom., 60:26-36, 2017. URL: https://doi.org/10.1016/j.comgeo.2016.07.002.
13. Richard Matthew McCutchen and Samir Khuller. Streaming algorithms for k-center clustering with outliers and with anonymity. In Proc. 11th and 12th International Workshop on Approximation, Randomization and Combinatorial Optimization (APPROX and RANDOM), volume 5171 of Lecture Notes in Computer Science, pages 165-178, 2008. URL: https://doi.org/10.1007/978-3-540-85363-3_14.
14. Hamid Zarrabi-Zadeh. Core-preserving algorithms. In Proceedings of the 20th Annual Canadian Conference on Computational Geometry (CCCG), 2008.
15. Hamid Zarrabi-Zadeh and Asish Mukhopadhyay. Streaming 1-center with outliers in high dimensions. In Proc. 21st Annual Canadian Conference on Computational Geometry (CCCG), pages 83-86, 2009. URL: http://cccg.ca/proceedings/2009/cccg09_22.pdf.