Large-Scale Distributed Algorithms for Facility Location with Outliers

Authors Tanmay Inamdar, Shreyas Pai, Sriram V. Pemmaraju



PDF
Thumbnail PDF

File

LIPIcs.OPODIS.2018.5.pdf
  • Filesize: 0.49 MB
  • 16 pages

Document Identifiers

Author Details

Tanmay Inamdar
  • Department of Computer Science, The University of Iowa, Iowa, USA
Shreyas Pai
  • Department of Computer Science, The University of Iowa, Iowa, USA
Sriram V. Pemmaraju
  • Department of Computer Science, The University of Iowa, Iowa, USA

Cite As Get BibTex

Tanmay Inamdar, Shreyas Pai, and Sriram V. Pemmaraju. Large-Scale Distributed Algorithms for Facility Location with Outliers. In 22nd International Conference on Principles of Distributed Systems (OPODIS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 125, pp. 5:1-5:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019) https://doi.org/10.4230/LIPIcs.OPODIS.2018.5

Abstract

This paper presents fast, distributed, O(1)-approximation algorithms for metric facility location problems with outliers in the Congested Clique model, Massively Parallel Computation (MPC) model, and in the k-machine model. The paper considers Robust Facility Location and Facility Location with Penalties, two versions of the facility location problem with outliers proposed by Charikar et al. (SODA 2001). The paper also considers two alternatives for specifying the input: the input metric can be provided explicitly (as an n x n matrix distributed among the machines) or implicitly as the shortest path metric of a given edge-weighted graph. The results in the paper are: 
- Implicit metric: For both problems, O(1)-approximation algorithms running in O(poly(log n)) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. 
- Explicit metric: For both problems, O(1)-approximation algorithms running in O(log log log n) rounds in the Congested Clique and the MPC model and O(1)-approximation algorithms running in O~(n/k) rounds in the k-machine model. 
Our main contribution is to show the existence of Mettu-Plaxton-style O(1)-approximation algorithms for both Facility Location with outlier problems. As shown in our previous work (Berns et al., ICALP 2012, Bandyapadhyay et al., ICDCN 2018) Mettu-Plaxton style algorithms are more easily amenable to being implemented efficiently in distributed and large-scale models of computation.

Subject Classification

ACM Subject Classification
  • Theory of computation → Facility location and clustering
  • Theory of computation → Distributed algorithms
  • Theory of computation → MapReduce algorithms
  • Theory of computation → Graph algorithms analysis
Keywords
  • Distributed Algorithms
  • Clustering with Outliers
  • Metric Facility Location
  • Massively Parallel Computation
  • k-machine model
  • Congested Clique

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Noga Alon, Yossi Matias, and Mario Szegedy. The Space Complexity of Approximating the Frequency Moments. In Proceedings of the Twenty-eighth Annual ACM Symposium on Theory of Computing, STOC '96, pages 20-29, New York, NY, USA, 1996. ACM. URL: http://dx.doi.org/10.1145/237814.237823.
  2. Alexandr Andoni, Aleksandar Nikolov, Krzysztof Onak, and Grigory Yaroslavtsev. Parallel Algorithms for Geometric Graph Problems. In Proceedings of the Forty-sixth Annual ACM Symposium on Theory of Computing, STOC '14, pages 574-583, New York, NY, USA, 2014. ACM. URL: http://dx.doi.org/10.1145/2591796.2591805.
  3. Aaron Archer, Ranjithkumar Rajagopalan, and David B. Shmoys. Lagrangian Relaxation for the k-Median Problem: New Insights and Continuity Properties. In Giuseppe Di Battista and Uri Zwick, editors, Algorithms - ESA 2003: 11th Annual European Symposium, Budapest, Hungary, September 16-19, 2003. Proceedings, pages 31-42, Berlin, Heidelberg, 2003. Springer Berlin Heidelberg. URL: http://dx.doi.org/10.1007/978-3-540-39658-1_6.
  4. Sayan Bandyapadhyay, Tanmay Inamdar, Shreyas Pai, and Sriram V. Pemmaraju. Near-Optimal Clustering in the k-machine model. In Proceedings of the 19th International Conference on Distributed Computing and Networking, ICDCN 2018, Varanasi, India, January 4-7, 2018, pages 15:1-15:10, 2018. URL: http://dx.doi.org/10.1145/3154273.3154317.
  5. Paul Beame, Paraschos Koutris, and Dan Suciu. Communication Steps for Parallel Query Processing. In Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS '13, pages 273-284, New York, NY, USA, 2013. ACM. URL: http://dx.doi.org/10.1145/2463664.2465224.
  6. Ruben Becker, Andreas Karrenbauer, Sebastian Krinninger, and Christoph Lenzen. Near-Optimal Approximate Shortest Paths and Transshipment in Distributed and Streaming Models. In Andréa W. Richa, editor, 31st International Symposium on Distributed Computing (DISC 2017), volume 91 of Leibniz International Proceedings in Informatics (LIPIcs), pages 7:1-7:16, Dagstuhl, Germany, 2017. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. URL: http://dx.doi.org/10.4230/LIPIcs.DISC.2017.7.
  7. Soheil Behnezhad, Mahsa Derakhshan, and MohammadTaghi Hajiaghayi. Brief Announcement: Semi-MapReduce Meets Congested Clique. CoRR, abs/1802.10297, 2018. URL: http://arxiv.org/abs/1802.10297.
  8. Andrew Berns, James Hegeman, and Sriram V. Pemmaraju. Super-Fast Distributed Algorithms for Metric Facility Location. In Automata, Languages, and Programming - 39th International Colloquium, ICALP 2012, Warwick, UK, July 9-13, 2012, Proceedings, Part II, pages 428-439, 2012. URL: http://dx.doi.org/10.1007/978-3-642-31585-5_39.
  9. Keren Censor-Hillel, Petteri Kaski, Janne H. Korhonen, Christoph Lenzen, Ami Paz, and Jukka Suomela. Algebraic Methods in the Congested Clique. In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, PODC '15, pages 143-152, New York, NY, USA, 2015. ACM. URL: http://dx.doi.org/10.1145/2767386.2767414.
  10. Moses Charikar, Samir Khuller, David M. Mount, and Giri Narasimhan. Algorithms for Facility Location Problems with Outliers. In Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '01, pages 642-651, Philadelphia, PA, USA, 2001. Society for Industrial and Applied Mathematics. Google Scholar
  11. Jeffrey Dean and Sanjay Ghemawat. MapReduce: A Flexible Data Processing Tool. Commun. ACM, 53(1):72-77, January 2010. URL: http://dx.doi.org/10.1145/1629175.1629198.
  12. Danny Dolev, Christoph Lenzen, and Shir Peled. "Tri, Tri Again": Finding Triangles and Small Subgraphs in a Distributed Setting. In Proceedings of the 26th International Symposium on Distributed Computing (DISC), pages 195-209, 2012. Google Scholar
  13. Andrew Drucker, Fabian Kuhn, and Rotem Oshman. The communication complexity of distributed task allocation. In Proceedings of the 30st ACM Symposium on Principles of Distributed Computing (PODC), pages 67-76, 2012. Google Scholar
  14. Alina Ene, Sungjin Im, and Benjamin Moseley. Fast Clustering Using MapReduce. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '11, pages 681-689, New York, NY, USA, 2011. ACM. URL: http://dx.doi.org/10.1145/2020408.2020515.
  15. Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, and Mauro Sozio. Scalable Facility Location for Massive Graphs on Pregel-like Systems. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, CIKM '15, pages 273-282, New York, NY, USA, 2015. ACM. URL: http://dx.doi.org/10.1145/2806416.2806508.
  16. Joachim Gehweiler, Christiane Lammersen, and Christian Sohler. A Distributed O(1)-approximation Algorithm for the Uniform Facility Location Problem. In Proceedings of the 18th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pages 237-243, 2006. URL: http://dx.doi.org/10.1145/1148109.1148152.
  17. Mohsen Ghaffari. Distributed MIS via All-to-All Communication. In Proceedings of the ACM Symposium on Principles of Distributed Computing, PODC 2017, Washington, DC, USA, July 25-27, 2017, pages 141-149, 2017. URL: http://dx.doi.org/10.1145/3087801.3087830.
  18. Mohsen Ghaffari, Themis Gouleakis, Christian Konrad, Slobodan Mitrovic, and Ronitt Rubinfeld. Improved Massively Parallel Computation Algorithms for MIS, Matching, and Vertex Cover. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing, PODC 2018, Egham, United Kingdom, July 23-27, 2018, pages 129-138, 2018. URL: http://dx.doi.org/10.1145/3212734.3212743.
  19. Michael T. Goodrich, Nodari Sitchinava, and Qin Zhang. Sorting, Searching, and Simulation in the Mapreduce Framework. In Proceedings of the 22nd International Conference on Algorithms and Computation, ISAAC'11, pages 374-383, Berlin, Heidelberg, 2011. Springer-Verlag. URL: http://dx.doi.org/10.1007/978-3-642-25591-5_39.
  20. James W. Hegeman and Sriram V. Pemmaraju. Lessons from the Congested Clique applied to MapReduce. Theor. Comput. Sci., 608:268-281, 2015. URL: http://dx.doi.org/10.1016/j.tcs.2015.09.029.
  21. James W. Hegeman and Sriram V. Pemmaraju. Sub-logarithmic distributed algorithms for metric facility location. Distributed Computing, 28(5):351-374, 2015. URL: http://dx.doi.org/10.1007/s00446-015-0243-x.
  22. James W. Hegeman, Sriram V. Pemmaraju, and Vivek Sardeshmukh. Near-Constant-Time Distributed Algorithms on a Congested Clique. CoRR, abs/1408.2071, 2014. URL: http://arxiv.org/abs/1408.2071.
  23. Stephan Holzer and Nathan Pinsker. Approximation of Distances and Shortest Paths in the Broadcast Congest Clique. arXiv preprint, arXiv:1412.3445, 2014. Google Scholar
  24. Tanmay Inamdar, Shreyas Pai, and Sriram V. Pemmaraju. Large-Scale Distributed Algorithms for Facility Location with Outliers. CoRR, abs/1811.06494, November 2018. URL: http://arxiv.org/abs/1811.06494.
  25. Kamal Jain and Vijay V. Vazirani. Approximation Algorithms for Metric Facility Location and k-Median Problems Using the Primal-dual Schema and Lagrangian Relaxation. J. ACM, 48(2):274-296, March 2001. URL: http://dx.doi.org/10.1145/375827.375845.
  26. Tomasz Jurdziński and Krzysztof Nowicki. MST in O(1) Rounds of Congested Clique. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '18, pages 2620-2632, Philadelphia, PA, USA, 2018. Society for Industrial and Applied Mathematics. Google Scholar
  27. Howard Karloff, Siddharth Suri, and Sergei Vassilvitskii. A Model of Computation for MapReduce. In Proceedings of the Twenty-first Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '10, pages 938-948, Philadelphia, PA, USA, 2010. Society for Industrial and Applied Mathematics. Google Scholar
  28. Hartmut Klauck, Danupon Nanongkai, Gopal Pandurangan, and Peter Robinson. Distributed Computation of Large-scale Graph Problems. In Proceedings of the Twenty-sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '15, pages 391-410, Philadelphia, PA, USA, 2015. Society for Industrial and Applied Mathematics. Google Scholar
  29. Christoph Lenzen. Optimal Deterministic Routing and Sorting on the Congested Clique. In Proceedings of the 31st ACM Symposium on Principles of Distributed Computing (PODC), pages 42-50, 2013. URL: http://dx.doi.org/10.1145/2484239.2501983.
  30. Shi Li. A 1.488 Approximation Algorithm for the Uncapacitated Facility Location Problem. In Proceedings of the 38th International Conference on Automata, Languages and Programming - Volume Part II, ICALP'11, pages 77-88, Berlin, Heidelberg, 2011. Springer-Verlag. Google Scholar
  31. Zvi Lotker, Boaz Patt-Shamir, Elan Pavlov, and David Peleg. Minimum-Weight Spanning Tree Construction in O(log log n) Communication Rounds. SIAM Journal on Computing, 35(1):120-131, 2005. Google Scholar
  32. Grzegorz Malewicz, Matthew H. Austern, Aart J.C Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: A System for Large-scale Graph Processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD '10, pages 135-146, New York, NY, USA, 2010. ACM. URL: http://dx.doi.org/10.1145/1807167.1807184.
  33. Ramgopal R. Mettu and C. Greg Plaxton. The Online Median Problem. SIAM J. Comput., 32(3):816-832, March 2003. URL: http://dx.doi.org/10.1137/S0097539701383443.
  34. Danupon Nanongkai. Distributed approximation algorithms for weighted shortest paths. In Proceedings of the 46th ACM Symposium on Theory of Computing (STOC), pages 565-573, 2014. Google Scholar
  35. Gopal Pandurangan, Peter Robinson, and Michele Scquizzato. Fast Distributed Algorithms for Connectivity and MST in Large Graphs. In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '16, pages 429-438, New York, NY, USA, 2016. ACM. URL: http://dx.doi.org/10.1145/2935764.2935785.
  36. Gopal Pandurangan, Peter Robinson, and Michele Scquizzato. On the Distributed Complexity of Large-Scale Graph Computations. In Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, SPAA 2018, Vienna, Austria, July 16-18, 2018, pages 405-414, 2018. URL: http://dx.doi.org/10.1145/3210377.3210409.
  37. Boaz Patt-Shamir and Marat Teplitsky. The Round Complexity of Distributed Sorting. In Proceedings of the 30th Annual ACM Symposium on Principles of Distributed Computing (PODC), pages 249-256, 2011. URL: http://dx.doi.org/10.1145/1993806.1993851.
  38. Jonathan A. Silva, Elaine R. Faria, Rodrigo C. Barros, Eduardo R. Hruschka, André C. P. L. F. de Carvalho, and João Gama. Data Stream Clustering: A Survey. ACM Comput. Surv., 46(1):13:1-13:31, July 2013. URL: http://dx.doi.org/10.1145/2522968.2522981.
  39. Mikkel Thorup. Quick k-Median, k-Center, and Facility Location for Sparse Graphs. SIAM Journal on Computing, 34(2):405-432, 2005. URL: http://dx.doi.org/10.1137/S0097539701388884.
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail