Building Clusters with Lower-Bounded Sizes

Authors: Faisal Abu-Khzam, Cristina Bazgan, Katrin Casel, Henning Fernau



Cite As

Faisal Abu-Khzam, Cristina Bazgan, Katrin Casel, and Henning Fernau. Building Clusters with Lower-Bounded Sizes. In 27th International Symposium on Algorithms and Computation (ISAAC 2016). Leibniz International Proceedings in Informatics (LIPIcs), Volume 64, pp. 4:1-4:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


Classical clustering problems search for a partition of objects into a fixed number of clusters. In many scenarios, however, the number of clusters is not known or not necessarily fixed. Further, clusters are sometimes only considered significant if they have a certain size. We discuss clustering into sets of minimum cardinality k without a fixed number of sets and present a general model for these types of problems. This general framework allows the comparison of different measures for assessing the quality of a clustering. We specifically consider nine quality measures and classify the complexity of the resulting problems with respect to k. Further, we derive some polynomial-time solvable cases for k = 2 with connections to matching-type problems, which, along with other graph problems, are then used to compute approximations for larger values of k.

Keywords

  • Clustering
  • Approximation Algorithms
  • Complexity
  • Matching



