Approximation Algorithms for Clustering Problems with Lower Bounds and Outliers
We consider clustering problems with non-uniform lower bounds and outliers, and obtain the first approximation guarantees for these problems. We have a set F of facilities with lower bounds {L_i}_{i in F} and a set D of clients located in a common metric space {c(i,j)}_{i,j in F union D}, and bounds k, m. A feasible solution is a pair (S subseteq F, sigma: D -> S union {out}), where sigma specifies the client assignments, such that |S| <=k, |sigma^{-1}(i)| >= L_i for all i in S, and |sigma^{-1}(out)| <= m. In the lower-bounded min-sum-of-radii with outliers P (LBkSRO) problem, the objective is to minimize sum_{i in S} max_{j in sigma^{-1})i)}, and in the lower-bounded k-supplier with outliers (LBkSupO) problem, the objective is to minimize max_{i in S} max_{j in sigma^{-1})i)} c(i,j).
We obtain an approximation factor of 12.365 for LBkSRO, which improves to 3.83 for the non-outlier version (i.e., m = 0). These also constitute the first approximation bounds for the min-sum-of-radii objective when we consider lower bounds and outliers separately. We apply the primal-dual method to the relaxation where we Lagrangify the |S| <= k constraint. The chief technical contribution and novelty of our algorithm is that, departing from the standard paradigm used for such constrained problems, we obtain an O(1)-approximation despite the fact that we do not obtain a Lagrangian-multiplier-preserving algorithm for the Lagrangian relaxation. We believe that our ideas have broader applicability to other clustering problems with outliers as well.
We obtain approximation factors of 5 and 3 respectively for LBkSupO and its non-outlier version. These are the first approximation results for k-supplier with non-uniform lower bounds.
Approximation algorithms
facililty-location problems
primal-dual method
Lagrangian relaxation
k-center problems
minimizing sum of radii
69:1-69:15
Regular Paper
Sara
Ahmadian
Sara Ahmadian
Chaitanya
Swamy
Chaitanya Swamy
10.4230/LIPIcs.ICALP.2016.69
Gagan Aggarwal, Tomás Feder, Krishnaram Kenthapadi, Samir Khuller, Rina Panigrahy, Dilys Thomas, and An Zhu. Achieving anonymity via clustering. ACM Transactions on Algorithms (TALG), 6(3):49, 2010.
Gagan Aggarwal, Tomás Feder, Krishnaram Kenthapadi, Rajeev Motwani, Rina Panigrahy, Dilys Thomas, and An Zhu. Approximation algorithms for k-anonymity. Journal of Privacy Technology (JOPT), 2005.
Sara Ahmadian and Chaitanya Swamy. Improved approximation guarantees for lower-bounded facility location. In Proceedings of the 10th International Workshop on Approximation and Online Algorithms, pages 257-271. Springer, 2012.
Hyung-Chan An, Aditya Bhaskara, Chandra Chekuri, Shalmoli Gupta, Vivek Madan, and Ola Svensson. Centrality of trees for capacitated k-center. Mathematical Programming, 154(1-2):29-53, 2015.
Babak Behsaz and Mohammad R Salavatipour. On minimum sum of radii and diameters clustering. Algorithmica, 73(1):143-165, 2015.
Jarosław Byrka, Thomas Pensyl, Bartosz Rybicki, Aravind Srinivasan, and Khoa Trinh. An improved approximation for k-median, and positive correlation in budgeted optimization. In Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 737-756. SIAM, 2015.
Vasilis Capoyleas, Günter Rote, and Gerhard Woeginger. Geometric clusterings. Journal of Algorithms, 12(2):341-356, 1991.
Moses Charikar and Sudipto Guha. Improved combinatorial algorithms for facility location problems. SIAM Journal on Computing, 34(4):803-824, 2005.
Moses Charikar, Sudipto Guha, Éva Tardos, and David B Shmoys. A constant-factor approximation algorithm for the k-median problem. In Proceedings of the 31st annual ACM Symposium on Theory of Computing, pages 1-10. ACM, 1999.
Moses Charikar, Samir Khuller, David Mount, and Giri Narasimhan. Algorithms for facility location problems with outliers. In Proceedings of the 12th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 642-651. SIAM, 2001.
Moses Charikar and Rina Panigrahy. Clustering to minimize the sum of cluster diameters. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, pages 1-10. ACM, 2001.
Ke Chen. A constant factor approximation algorithm for k-median clustering with outliers. In Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 826-835. SIAM, 2008.
Marek Cygan, MohammadTaghi Hajiaghayi, and Samir Khuller. Lp rounding for k-centers with non-uniform hard capacities. In Proceedings of the 53rd Annual IEEE Symposium on Foundations of Computer Science, pages 273-282. IEEE, 2012.
Marek Cygan and Tomasz Kociumaka. Constant factor approximation for capacitated k-center with outliers. In Proc. 31st International Symposium on Theoretical Aspects of Computer Science (STACS 2014), volume 25 of LIPIcs - Leibniz International Proceedings in Informatics, pages 251-262. Schloss Dagstuhl -Leibniz-Zentrum fuer Informatik, 2014. URL: http://dx.doi.org/10.4230/LIPIcs.STACS.2014.251.
http://dx.doi.org/10.4230/LIPIcs.STACS.2014.251
Srinivas R Doddi, Madhav V Marathe, SS Ravi, David Scot Taylor, and Peter Widmayer. Approximation algorithms for clustering to minimize the sum of diameters. In Proceedings of the 11th Scandinavian Workshop on Algorithm Theory, pages 237-250, 2000.
Alina Ene, Sariel Har-Peled, and Benjamin Raichel. Fast clustering with lower bounds: No customer too far, no shop too small. arXiv preprint arXiv:1304.7318, 2013.
Matt Gibson, Gaurav Kanade, Erik Krohn, Imran A Pirwani, and Kasturi Varadarajan. On metric clustering to minimize the sum of radii. Algorithmica, 57(3):484-498, 2010.
Matt Gibson, Gaurav Kanade, Erik Krohn, Imran A Pirwani, and Kasturi Varadarajan. On clustering to minimize the sum of radii. SIAM Journal on Computing, 41(1):47-60, 2012.
Sudipto Guha, Adam Meyerson, and Kamesh Munagala. Hierarchical placement and network design problems. In Proceedings of 41st Annual IEEE Symposium on Foundations of Computer Science, pages 603-612. IEEE, 2000.
Dorit S Hochbaum and David B Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2):180-184, 1985.
Dorit S Hochbaum and David B Shmoys. A unified approach to approximation algorithms for bottleneck problems. Journal of the ACM, 33(3):533-550, 1986.
Kamal Jain and Vijay V Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and lagrangian relaxation. Journal of the ACM, 48(2):274-296, 2001.
David R. Karger and Maria Minkoff. Building steiner trees with incomplete global knowledge. In Proceedings of 41st Annual IEEE Symposium on Foundations of Computer Science, pages 613-623. IEEE, 2000.
Samir Khuller and Yoram J Sussmann. The capacitated k-center problem. SIAM Journal on Discrete Mathematics, 13(3):403-418, 2000.
Shi Li and Ola Svensson. Approximating k-median via pseudo-approximation. In Proceedings of the 44th Annual ACM Symposium on Theory of Computing, pages 901-910. ACM, 2013.
Andrew Lim, Fan Wang, and Zhou Xu. A transportation problem with minimum quantity commitment. Transportation Science, 40(1):117-129, 2006.
Pitu B Mirchandani and Richard L Francis. Discrete location theory. Wiley-Interscience, 1990.
Pierangela Samarati. Protecting respondents identities in microdata release. IEEE Transactions on Knowledge and Data Engineering, 13(6):1010-1027, 2001.
David B Shmoys. The design and analysis of approximation algorithms. Trends in Optimization: American Mathematical Society Short Course, January 5-6, 2004, Phoeniz, Arizona, 61:85, 2004.
Zoya Svitkina. Lower-bounded facility location. ACM Transactions on Algorithms (TALG), 6(4):69, 2010.
Creative Commons Attribution 3.0 Unported license
https://creativecommons.org/licenses/by/3.0/legalcode