Document

# Small Space Stream Summary for Matroid Center

## File

LIPIcs.APPROX-RANDOM.2019.20.pdf
• Filesize: 0.54 MB
• 22 pages

## Acknowledgements

I thank Ashish Chiplunkar for his contributions, Maryam Negahbani for discussions, and anonymous reviewers for helpful comments.

## Cite As

Sagar Kale. Small Space Stream Summary for Matroid Center. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 145, pp. 20:1-20:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)
https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2019.20

## Abstract

In the matroid center problem, which generalizes the k-center problem, we need to pick a set of centers that is an independent set of a matroid with rank r. We study this problem in streaming, where elements of the ground set arrive in the stream. We first show that any randomized one-pass streaming algorithm that computes a better than Delta-approximation for partition-matroid center must use Omega(r^2) bits of space, where Delta is the aspect ratio of the metric and can be arbitrarily large. This shows a quadratic separation between matroid center and k-center, for which the Doubling algorithm [Charikar et al., 1997] gives an 8-approximation using O(k)-space and one pass. To complement this, we give a one-pass algorithm for matroid center that stores at most O(r^2 log(1/epsilon)/epsilon) points (viz., stream summary) among which a (7+epsilon)-approximate solution exists, which can be found by brute force, or a (17+epsilon)-approximation can be found with an efficient algorithm. If we are allowed a second pass, we can compute a (3+epsilon)-approximation efficiently. We also consider the problem of matroid center with z outliers and give a one-pass algorithm that outputs a set of O((r^2+rz)log(1/epsilon)/epsilon) points that contains a (15+epsilon)-approximate solution. Our techniques extend to knapsack center and knapsack center with z outliers in a straightforward way, and we get algorithms that use space linear in the size of a largest feasible set (as opposed to quadratic space for matroid center).

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Streaming models
• Theory of computation → Facility location and clustering
• Mathematics of computing → Matroids and greedoids
##### Keywords
• Streaming Algorithms
• Matroids
• Clustering

## Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0

## References

1. Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, and Anthony Wirth. Correlation Clustering in Data Streams. In Proceedings of the 32Nd International Conference on International Conference on Machine Learning - Volume 37, ICML'15, pages 2237-2246, 2015.
2. Ashwinkumar Badanidiyuru Varadaraja. Buyback problem: approximate matroid intersection with cancellation costs. In Proceedings of the 38th international colloquium conference on Automata, languages and programming - Volume Part I, ICALP'11, pages 379-390, 2011.
3. Amit Chakrabarti and Sagar Kale. Submodular maximization meets streaming: matchings, matroids, and more. Mathematical Programming, 154(1):225-247, 2015. URL: https://doi.org/10.1007/s10107-015-0900-7.
4. Deeparnab Chakrabarty, Prachi Goyal, and Ravishankar Krishnaswamy. The Non-Uniform k-Center Problem. In 43rd International Colloquium on Automata, Languages, and Programming, ICALP 2016, pages 67:1-67:15, 2016.
5. Deeparnab Chakrabarty and Maryam Negahbani. Generalized Center Problems with Outliers. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018), volume 107, pages 30:1-30:14, 2018.
6. T-H. Hubert Chan, Arnaud Guerqin, and Mauro Sozio. Fully Dynamic k-Center Clustering. In Proceedings of the 2018 World Wide Web Conference, WWW '18, pages 579-587, 2018.
7. Moses Charikar, Chandra Chekuri, Tomás Feder, and Rajeev Motwani. Incremental Clustering and Dynamic Information Retrieval. In Proc. 29th Annual ACM Symposium on the Theory of Computing, STOC '97, pages 626-635, 1997.
8. Moses Charikar, Samir Khuller, David M. Mount, and Giri Narasimhan. Algorithms for Facility Location Problems with Outliers. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '01, pages 642-651, 2001.
9. Moses Charikar, Liadan O'Callaghan, and Rina Panigrahy. Better Streaming Algorithms for Clustering Problems. In Proceedings of the Thirty-fifth Annual ACM Symposium on Theory of Computing, STOC '03, pages 30-39. ACM, 2003.
10. Chandra Chekuri, Shalmoli Gupta, and Kent Quanrud. Streaming Algorithms for Submodular Function Maximization. In Proc. 42nd International Colloquium on Automata, Languages and Programming, pages 318-330, 2015.
11. Danny Z. Chen, Jian Li, Hongyu Liang, and Haitao Wang. Matroid and Knapsack Center Problems. Algorithmica, 75(1):27-52, May 2016.
12. Vincent Cohen-Addad, Chris Schwiegelshohn, and Christian Sohler. Diameter and k-Center in Sliding Windows. In 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016), volume 55, pages 19:1-19:12, 2016.
13. Teofilo F. Gonzalez. Clustering to Minimize the Maximum Intercluster Distance. Theor. Comput. Sci., 38:293-306, 1985.
14. S. Guha, A. Meyerson, N. Mishra, R. Motwani, and L. O'Callaghan. Clustering data streams: Theory and practice. IEEE Transactions on Knowledge and Data Engineering, 15(3):515-528, May 2003.
15. Sudipto Guha. Tight Results for Clustering and Summarizing Data Streams. In Proc. 12th International Conference on Database Theory, ICDT '09, pages 268-275, 2009.
16. MohammadTaghi Hajiaghayi, Rohit Khandekar, and Guy Kortsarz. Budgeted Red-blue Median and Its Generalizations. In Proceedings of the 18th Annual European Conference on Algorithms: Part I, ESA'10, pages 314-325. Springer-Verlag, 2010. URL: http://dl.acm.org/citation.cfm?id=1888935.1888972.
17. S. L. Hakimi. Optimum Locations of Switching Centers and the Absolute Centers and Medians of a Graph. Oper. Res., 12(3):450-459, June 1964. URL: https://doi.org/10.1287/opre.12.3.450.
18. S. L. Hakimi. Optimum Distribution of Switching Centers in a Communication Network and Some Related Graph Theoretic Problems. Oper. Res., 13(3):462-475, June 1965. URL: https://doi.org/10.1287/opre.13.3.462.
19. David G. Harris, Thomas Pensyl, Aravind Srinivasan, and Khoa Trinh. A Lottery Model for Center-Type Problems with Outliers. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM, pages 10:1-10:19, 2017. URL: https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2017.10.
20. Behnam Hatami and Hamid Zarrabi-Zadeh. A Streaming Algorithm for 2-Center with Outliers in High Dimensions. Comput. Geom., 60:26-36, 2017.
21. Dorit S. Hochbaum and David B. Shmoys. A Best Possible Heuristic for the k-Center Problem. Math. Oper. Res., 10(2):180-184, May 1985. URL: https://doi.org/10.1287/moor.10.2.180.
22. Dorit S. Hochbaum and David B. Shmoys. A Unified Approach to Approximation Algorithms for Bottleneck Problems. J. ACM, 33(3):533-550, May 1986. URL: https://doi.org/10.1145/5925.5933.
23. Wen-Lian Hsu and George L. Nemhauser. Easy and hard bottleneck location problems. Discrete Applied Mathematics, 1(3):209-215, 1979. URL: https://doi.org/10.1016/0166-218X(79)90044-1.
24. Sang-Sub Kim and Hee-Kap Ahn. An improved data stream algorithm for clustering. Computational Geometry, 48(9):635-645, 2015. URL: https://doi.org/10.1016/j.comgeo.2015.06.003.
25. Ravishankar Krishnaswamy, Amit Kumar, Viswanath Nagarajan, Yogish Sabharwal, and Barna Saha. The Matroid Median Problem. In Proceedings of the Twenty-second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '11, pages 1117-1130, 2011.
26. Silvio Lattanzi, Stefano Leonardi, Vahab Mirrokni, and Ilya Razenshteyn. Robust Hierarchical k-Center Clustering. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, ITCS '15, pages 211-218, 2015.
27. Gustavo Malkomes, Matt J Kusner, Wenlin Chen, Kilian Q Weinberger, and Benjamin Moseley. Fast Distributed k-Center Clustering with Outliers on Massive Data. In Advances in Neural Information Processing Systems 28, pages 1063-1071. Curran Associates, Inc., 2015. URL: http://papers.nips.cc/paper/5997-fast-distributed-k-center-clustering-with-outliers-on-massive-data.pdf.
28. Richard Matthew McCutchen and Samir Khuller. Streaming Algorithms for k-Center Clustering with Outliers and with Anonymity. In Proc. 11th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, pages 165-178, 2008.
29. Hamid Zarrabi-Zadeh and Asish Mukhopadhyay. Streaming 1-Center with Outliers in High Dimensions. In Proceedings of the 21st Annual Canadian Conference on Computational Geometry, Vancouver, British Columbia, Canada, August 17-19, 2009, pages 83-86, 2009.