A New Quartet Tree Heuristic for Hierarchical Clustering

Authors: Rudi Cilibrasi, Paul M. B. Vitányi





Cite As

Rudi Cilibrasi and Paul M. B. Vitányi. A New Quartet Tree Heuristic for Hierarchical Clustering. In Theory of Evolutionary Algorithms. Dagstuhl Seminar Proceedings, Volume 6061, pp. 1-13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2006) https://doi.org/10.4230/DagSemProc.06061.4

Abstract

We present a new quartet heuristic for hierarchical clustering from a given distance matrix. We determine a dendrogram (ternary tree) by a new quartet method and a fast heuristic to implement it. We do not assume that there is a true ternary tree that generated the distances and which we wish to recover as closely as possible. Our aim is to model the distance matrix as faithfully as possible by the dendrogram. Our algorithm is essentially randomized hill-climbing, using parallelized Genetic Programming, where undirected trees evolve in a random walk driven by a prescribed fitness function. Our method is capable of handling up to 60–80 objects in a matter of hours, while no existing quartet heuristic can directly compute a quartet tree of more than about 20–30 objects without running for years. The method is implemented and available as public software at www.complearn.org. We present applications in many areas like music, literature, bird-flu (H5N1) virus clustering, and automatic meaning discovery using Google.
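
To illustrate the general idea of hill-climbing on a ternary tree under a quartet-based fitness, here is a minimal Python sketch. It is not the authors' implementation and not the CompLearn API. Several things are assumptions made for illustration only: the cost of a quartet topology uv|wx is taken to be d(u,v) + d(w,x), the tree cost is the sum of those costs over all quartets, the tree shape is a fixed caterpillar, and the only mutation is a swap of two leaf labels. All function names (`caterpillar`, `leaf_distances`, `tree_cost`, `hill_climb`) are hypothetical.

```python
import itertools
import random
from collections import deque


def caterpillar(n):
    """Unrooted ternary tree (assumed shape): leaves 0..n-1, internal nodes n..2n-3."""
    adj = {v: set() for v in range(2 * n - 2)}
    internal = list(range(n, 2 * n - 2))
    for a, b in zip(internal, internal[1:]):
        adj[a].add(b)
        adj[b].add(a)
    # The two end internal nodes carry two leaves each, the others one leaf.
    slots = [internal[0]] + internal + [internal[-1]]
    for leaf, node in enumerate(slots):
        adj[leaf].add(node)
        adj[node].add(leaf)
    return adj


def leaf_distances(adj, n):
    """Hop counts from every leaf to every node, by breadth-first search."""
    dist = {}
    for start in range(n):
        seen = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[start] = seen
    return dist


def tree_cost(d, t, place):
    """Sum, over all quartets, of the cost of the topology the tree embeds.

    place[i] is the leaf holding object i. The embedded topology is the
    pairing with the smallest sum of tree path lengths. The quartet cost
    d[u][v] + d[w][x] is an assumption of this sketch.
    """
    total = 0.0
    for a, b, c, e in itertools.combinations(range(len(place)), 4):
        pairings = [(a, b, c, e), (a, c, b, e), (a, e, b, c)]
        u, v, w, x = min(
            pairings,
            key=lambda p: t[place[p[0]]][place[p[1]]] + t[place[p[2]]][place[p[3]]],
        )
        total += d[u][v] + d[w][x]
    return total


def hill_climb(d, steps=2000, seed=0):
    """Randomized hill-climbing over leaf placements (needs at least 4 objects)."""
    rng = random.Random(seed)
    n = len(d)
    t = leaf_distances(caterpillar(n), n)
    place = list(range(n))
    rng.shuffle(place)
    best = tree_cost(d, t, place)
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        place[i], place[j] = place[j], place[i]  # mutate: swap two leaves
        cost = tree_cost(d, t, place)
        if cost <= best:
            best = cost  # keep the mutation
        else:
            place[i], place[j] = place[j], place[i]  # undo it
    return place, best
```

Because the sketch never changes the tree shape, it explores a much smaller search space than a method in which the trees themselves evolve. Each cost evaluation enumerates all quartets, so the work per step grows with the fourth power of the number of objects.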

Subject Classification

Keywords
  • Genetic programming
  • hierarchical clustering
  • quartet tree method
