Service in Your Neighborhood: Fairness in Center Location

Authors Christopher Jung, Sampath Kannan, Neil Lutz



PDF
Thumbnail PDF

File

LIPIcs.FORC.2020.5.pdf
  • Filesize: 5 MB
  • 15 pages

Document Identifiers

Author Details

Christopher Jung
  • University of Pennsylvania, Philadelphia, PA, USA
Sampath Kannan
  • University of Pennsylvania, Philadelphia, PA, USA
Neil Lutz
  • Iowa State University, Ames, IA, USA

Acknowledgements

We thank Moni Naor and Omer Reingold for helpful discussions, and we thank Anupam Gupta for alerting us to the similarities between our Theorem 2 and Lemma 1 of Chan, Dinitz, and Gupta [Chan et al., 2006].

Cite AsGet BibTex

Christopher Jung, Sampath Kannan, and Neil Lutz. Service in Your Neighborhood: Fairness in Center Location. In 1st Symposium on Foundations of Responsible Computing (FORC 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 156, pp. 5:1-5:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)
https://doi.org/10.4230/LIPIcs.FORC.2020.5

Abstract

When selecting locations for a set of centers, standard clustering algorithms may place unfair burden on some individuals and neighborhoods. We formulate a fairness concept that takes local population densities into account. In particular, given k centers to locate and a population of size n, we define the "neighborhood radius" of an individual i as the minimum radius of a ball centered at i that contains at least n/k individuals. Our objective is to ensure that each individual has a center that is within at most a small constant factor of her neighborhood radius. We present several theoretical results: We show that optimizing this factor is NP-hard; we give an approximation algorithm that guarantees a factor of at most 2 in all metric spaces; and we prove matching lower bounds in some metric spaces. We apply a variant of this algorithm to real-world address data, showing that it is quite different from standard clustering algorithms and outperforms them on our objective function and balances the load between centers more evenly.

Subject Classification

ACM Subject Classification
  • Theory of computation → Theory and algorithms for application domains
Keywords
  • Fairness
  • Clustering
  • Facility Location

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amineh Amini, Ying Wah Teh, and Hadi Saboohi. On density-based data streams clustering algorithms: A survey. J. Comput. Sci. Technol., 29(1):116-141, 2014. URL: https://doi.org/10.1007/s11390-014-1416-y.
  2. T. Asano, B. Bhattacharya, M. Keil, and F. Yao. Clustering algorithms based on minimum and maximum spanning trees. In Proceedings of the Fourth Annual Symposium on Computational Geometry, SCG '88, pages 252-257, New York, NY, USA, 1988. ACM. URL: https://doi.org/10.1145/73393.73419.
  3. Arturs Backurs, Piotr Indyk, Krzysztof Onak, Baruch Schieber, Ali Vakilian, and Tal Wagner. Scalable fair clustering. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pages 405-413, 2019. URL: http://proceedings.mlr.press/v97/backurs19a.html.
  4. Jon Louis Bentley. Multidimensional binary search trees used for associative searching. Commun. ACM, 18(9):509-517, September 1975. URL: https://doi.org/10.1145/361002.361007.
  5. Suman Kalyan Bera, Deeparnab Chakrabarty, Nicolas Flores, and Maryam Negahbani. Fair algorithms for clustering. In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, 8-14 December 2019, Vancouver, BC, Canada, pages 4955-4966, 2019. URL: http://papers.nips.cc/paper/8741-fair-algorithms-for-clustering.
  6. Ioana Oriana Bercea, Martin Groß, Samir Khuller, Aounon Kumar, Clemens Rösner, Daniel R. Schmidt, and Melanie Schmidt. On the cost of essentially fair clusterings. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, APPROX/RANDOM 2019, September 20-22, 2019, Massachusetts Institute of Technology, Cambridge, MA, USA, pages 18:1-18:22, 2019. URL: https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2019.18.
  7. T. H. Hubert Chan, Michael Dinitz, and Anupam Gupta. Spanners with slack. In Yossi Azar and Thomas Erlebach, editors, Algorithms - ESA 2006, pages 196-207, Berlin, Heidelberg, 2006. Springer Berlin Heidelberg. Google Scholar
  8. Xingyu Chen, Brandon Fain, Liang Lyu, and Kamesh Munagala. Proportionally fair clustering. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pages 1032-1041, 2019. URL: http://proceedings.mlr.press/v97/chen19d.html.
  9. Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. Fair clustering through fairlets. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 5029-5037, 2017. URL: http://papers.nips.cc/paper/7088-fair-clustering-through-fairlets.
  10. A. Chouldechova and A. Roth. The Frontiers of Fairness in Machine Learning. arXiv e-prints, October 2018. URL: http://arxiv.org/abs/1810.08810.
  11. Petros Drineas, Alan M. Frieze, Ravi Kannan, Santosh Vempala, and V. Vinay. Clustering large graphs via the singular value decomposition. Machine Learning, 56(1-3):9-33, 2004. URL: https://doi.org/10.1023/B:MACH.0000033113.59016.96.
  12. Charles Elkan. Using the triangle inequality to accelerate k-means. In Proceedings of the Twentieth International Conference on International Conference on Machine Learning, ICML'03, pages 147-153. AAAI Press, 2003. URL: http://dl.acm.org/citation.cfm?id=3041838.3041857.
  13. Fairfax County GIS. Address points, 2019. URL: https://catalog.data.gov/dataset/address-points-b4b16.
  14. Heather Haddon and Annie Gasparro. Companies and government seek new answers for food deserts. The Wall Street Journal, October 2016. URL: https://www.wsj.com/articles/companies-and-government-seek-new-answers-for-food-deserts-1476670262.
  15. Dorit S. Hochbaum. When are NP-hard location problems easy? Annals OR, 1(3):201-214, 1984. URL: https://doi.org/10.1007/BF01874389.
  16. Anil K. Jain and Richard C. Dubes. Algorithms for clustering data, 1988. Google Scholar
  17. Matthäus Kleindessner, Pranjal Awasthi, and Jamie Morgenstern. Fair k-center clustering for data summarization. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pages 3448-3457, 2019. URL: http://proceedings.mlr.press/v97/kleindessner19a.html.
  18. Matthäus Kleindessner, Samira Samadi, Pranjal Awasthi, and Jamie Morgenstern. Guarantees for spectral clustering with fairness constraints. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pages 3458-3467, 2019. URL: http://proceedings.mlr.press/v97/kleindessner19b.html.
  19. Ying Liao, Huan Qi, and Weiqun Li. Load-balanced clustering algorithm with distributed self-organization for wireless sensor networks. IEEE sensors journal, 13(5):1498-1506, 2012. Google Scholar
  20. S. Lloyd. Least squares quantization in pcm. IEEE Trans. Inf. Theor., 28(2):129-137, September 2006. URL: https://doi.org/10.1109/TIT.1982.1056489.
  21. J. MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, pages 281-297, Berkeley, Calif., 1967. University of California Press. URL: https://projecteuclid.org/euclid.bsmsp/1200512992.
  22. Nimrod Megiddo and Kenneth J. Supowit. On the complexity of some common geometric location problems. SIAM J. Comput., 13(1):182-196, 1984. URL: https://doi.org/10.1137/0213014.
  23. Allegheny County / City of Pittsburgh / Western PA Regional Data Center. Allegheny county address points, 2018. URL: https://catalog.data.gov/dataset/allegheny-county-address-points-07dff.
  24. Issi Romem. Getting around, or just getting by? Where people live with fewer cars, 2019. URL: https://www.trulia.com/research/people-per-vehicle-map/.
  25. U.S. Census Bureau. Population, housing units, area, and density: 2010 - county - census tract, 2010 census summary file 1, 2010 Census. URL: https://factfinder.census.gov.
  26. Kelly Virella. Doctors and health workers reflect on rural america’s limited access to care. The New York Times, 2018. URL: https://www.nytimes.com/2018/07/19/reader-center/rural-health-care.html.
  27. Wei-Tung Wang, Yi-Leh Wu, Cheng-Yuan Tang, and Maw-Kae Hor. Adaptive density-based spatial clustering of applications with noise (dbscan) according to data. In 2015 International Conference on Machine Learning and Cybernetics (ICMLC), volume 1, pages 445-451. IEEE, 2015. Google Scholar
  28. David P. Williamson and David B. Shmoys. The Design of Approximation Algorithms. Cambridge University Press, New York, NY, USA, 1st edition, 2011. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail