Document

**Published in:** LIPIcs, Volume 143, 19th International Workshop on Algorithms in Bioinformatics (WABI 2019)

We combine two fundamental, previously studied optimization problems related to the construction of phylogenetic trees called maximum rooted triplets consistency (MAXRTC) and minimally resolved supertree (MINRS) into a new problem, which we call q-maximum rooted triplets consistency (q-MAXRTC). The input to our new problem is a set R of resolved triplets (rooted, binary phylogenetic trees with three leaves each) and the objective is to find a phylogenetic tree with exactly q internal nodes that contains the largest possible number of triplets from R. We first prove that q-MAXRTC is NP-hard even to approximate within a constant ratio for every fixed q >= 2, and then develop various polynomial-time approximation algorithms for different values of q. Next, we show experimentally that representing a phylogenetic tree by one having much fewer nodes typically does not destroy too much triplet branching information. As an extreme example, we show that allowing only nine internal nodes is still sufficient to capture on average 80% of the rooted triplets from some recently published trees, each having between 760 and 3081 internal nodes. Finally, to demonstrate the algorithmic advantage of using trees with few internal nodes, we propose a new algorithm for computing the rooted triplet distance between two phylogenetic trees over a leaf label set of size n that runs in O(q n) time, where q is the number of internal nodes in the smaller tree, and is therefore faster than the currently best algorithms for the problem (with O(n log n) time complexity [SODA 2013, ESA 2017]) whenever q = o(log n).

Jesper Jansson, Konstantinos Mampentzidis, and Sandhya T. P.. Building a Small and Informative Phylogenetic Supertree. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 1:1-1:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)

Copy BibTex To Clipboard

@InProceedings{jansson_et_al:LIPIcs.WABI.2019.1, author = {Jansson, Jesper and Mampentzidis, Konstantinos and T. P., Sandhya}, title = {{Building a Small and Informative Phylogenetic Supertree}}, booktitle = {19th International Workshop on Algorithms in Bioinformatics (WABI 2019)}, pages = {1:1--1:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-123-8}, ISSN = {1868-8969}, year = {2019}, volume = {143}, editor = {Huber, Katharina T. and Gusfield, Dan}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2019.1}, URN = {urn:nbn:de:0030-drops-110316}, doi = {10.4230/LIPIcs.WABI.2019.1}, annote = {Keywords: phylogenetic tree, supertree, rooted triplet, approximation algorithm} }

Document

**Published in:** LIPIcs, Volume 87, 25th Annual European Symposium on Algorithms (ESA 2017)

We study the problem of computing the triplet distance between two rooted unordered trees with n labeled leafs. Introduced by Dobson 1975, the triplet distance is the number of leaf triples that induce different topologies in the two trees. The current theoretically best algorithm is an O(nlogn) time algorithm by Brodal et al. [SODA 2013]. Recently Jansson et al. proposed a new algorithm that, while slower in theory, requiring O(n log^3 n) time, in practice it outperforms the theoretically faster O(n log n) algorithm. Both algorithms do not scale to external memory.
We present two cache oblivious algorithms that combine the best of both worlds. The first algorithm is for the case when the two input trees are binary trees and the second a generalized algorithm for two input trees of arbitrary degree. Analyzed in the RAM model, both algorithms require O(n log n) time, and in the cache oblivious model O(n/B log_{2}(n/M)) I/Os. Their relative simplicity and the fact that they scale to external memory makes them achieve the best practical performance. We note that these are the first algorithms that scale to external memory, both in theory and practice, for this problem.

Gerth Stølting Brodal and Konstantinos Mampentzidis. Cache Oblivious Algorithms for Computing the Triplet Distance Between Trees. In 25th Annual European Symposium on Algorithms (ESA 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 87, pp. 21:1-21:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)

Copy BibTex To Clipboard

@InProceedings{stltingbrodal_et_al:LIPIcs.ESA.2017.21, author = {St{\o}lting Brodal, Gerth and Mampentzidis, Konstantinos}, title = {{Cache Oblivious Algorithms for Computing the Triplet Distance Between Trees}}, booktitle = {25th Annual European Symposium on Algorithms (ESA 2017)}, pages = {21:1--21:14}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-049-1}, ISSN = {1868-8969}, year = {2017}, volume = {87}, editor = {Pruhs, Kirk and Sohler, Christian}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2017.21}, URN = {urn:nbn:de:0030-drops-78820}, doi = {10.4230/LIPIcs.ESA.2017.21}, annote = {Keywords: Phylogenetic tree, tree comparison, triplet distance, cache oblivious algorithm} }

X

Feedback for Dagstuhl Publishing

Feedback submitted

Please try again later or send an E-mail