Hierarchical Clusterings of Unweighted Graphs

Høgemo, Svein; Paul, Christophe; Telle, Jan Arne

doi:10.4230/LIPIcs.MFCS.2020.47

File

Author Details

Svein Høgemo

Department of Informatics, University of Bergen, Norway

Christophe Paul

LIRMM, University of Montpellier, CNRS, France

Jan Arne Telle

Department of Informatics, University of Bergen, Norway

Cite As Get BibTex

Svein Høgemo, Christophe Paul, and Jan Arne Telle. Hierarchical Clusterings of Unweighted Graphs. In 45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 170, pp. 47:1-47:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/LIPIcs.MFCS.2020.47

Abstract

We study the complexity of finding an optimal hierarchical clustering of an unweighted similarity graph under the recently introduced Dasgupta objective function. We introduce a proof technique, called the normalization procedure, that takes any such clustering of a graph G and iteratively improves it until a desired target clustering of G is reached. We use this technique to show both a negative and a positive complexity result. Firstly, we show that in general the problem is NP-complete. Secondly, we consider min-well-behaved graphs, which are graphs H having the property that for any k the graph H^{(k)} being the join of k copies of H has an optimal hierarchical clustering that splits each copy of H in the same optimal way. To optimally cluster such a graph H^{(k)} we thus only need to optimally cluster the smaller graph H. Co-bipartite graphs are min-well-behaved, but otherwise they seem to be scarce. We use the normalization procedure to show that also the cycle on 6 vertices is min-well-behaved.

Subject Classification

ACM Subject Classification

Theory of computation → Design and analysis of algorithms

Keywords

Hierarchical Clustering

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

N. Bansal, A. Blum, and S. Chawla. Correlation clustering. Machine Learning, 56(1-3):89-113, 2004.
P. Buneman. The recovery of trees from measures of dissimilarity. Mathematics in the Archaeological and Historical Sciences, pages 387-395, 1971.
S. Chakrabarti, M. Ester, U. Fayyad, J. Gehrke, J. Han, S. Morishita, G. Piatetsky-Shapiro, and W. Wang. Data mining curriculum: A proposal (version 1.0). Technical report, Intensive Working Group of ACM SIGKDD, 2006.
M. Charikar and V. Chatziafratis. Approximate hierarchical clustering via sparsest cut and spreading metrics. In Annual ACM-SIAM symposium on Discrete algorithms (SODA), pages 841-854, 2017.
V. Cohen-Addad, V. Kanade, F. Mallmann-Trenn, and C. Mathieu. Hierarchical clustering: Objective functions and algorithms. Journal of ACM, 66(4):26:1-26-42, 2019.
S. Dasgupta. A cost function for similarity-based hierarchical clustering. In Annual ACM symposium on Theory of Computing (STOC), pages 118-127, 2016.
S. Dasgupta. Hardness of hierarchical clustering optimization. Private communication, 2019.
R. Diestel. Graph theory. Springer-Verlag, 2005.
J. Hartigan. Clustering algorithms. John Wiley and Sons, 1975.
T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning: data mining, inference, and prediction. Springer Series in Statistics. Springer, second edition, 2009.
K. Koutroumbas and S. Theodoridis. Pattern recognition. Academic Press, fourth edition, 2009.
R. Sokal and P. Sneath. Numerical taxonomy. W.H. Freeman, 1963.

Hierarchical Clusterings of Unweighted Graphs

Authors Svein Høgemo, Christophe Paul, Jan Arne Telle

File

Document Identifiers

Author Details

Cite As Get BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Hierarchical Clusterings of Unweighted Graphs

Authors Svein Høgemo, Christophe Paul, Jan Arne Telle

File

Document Identifiers

Author Details

Cite As Get BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Related Versions

References

Thanks for your feedback!

Could not send message