Building a Small and Informative Phylogenetic Supertree

Authors Jesper Jansson, Konstantinos Mampentzidis, Sandhya T. P.




File

LIPIcs.WABI.2019.1.pdf
  • Filesize: 0.56 MB
  • 14 pages


Author Details

Jesper Jansson
  • The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Konstantinos Mampentzidis
  • Department of Computer Science, Aarhus University, Aarhus, Denmark
Sandhya T. P.
  • The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

Cite As

Jesper Jansson, Konstantinos Mampentzidis, and Sandhya T. P. Building a Small and Informative Phylogenetic Supertree. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 1:1-1:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019).
https://doi.org/10.4230/LIPIcs.WABI.2019.1

Abstract

We combine two fundamental, previously studied optimization problems related to the construction of phylogenetic trees, maximum rooted triplets consistency (MAXRTC) and minimally resolved supertree (MINRS), into a new problem, which we call q-maximum rooted triplets consistency (q-MAXRTC). The input to our new problem is a set R of resolved triplets (rooted, binary phylogenetic trees with three leaves each) and the objective is to find a phylogenetic tree with exactly q internal nodes that contains the largest possible number of triplets from R. We first prove that q-MAXRTC is NP-hard even to approximate within a constant ratio for every fixed q >= 2, and then develop various polynomial-time approximation algorithms for different values of q. Next, we show experimentally that representing a phylogenetic tree by one having far fewer nodes typically does not destroy too much triplet branching information. As an extreme example, we show that allowing only nine internal nodes is still sufficient to capture on average 80% of the rooted triplets from some recently published trees, each having between 760 and 3081 internal nodes. Finally, to demonstrate the algorithmic advantage of using trees with few internal nodes, we propose a new algorithm for computing the rooted triplet distance between two phylogenetic trees over a leaf label set of size n that runs in O(q n) time, where q is the number of internal nodes in the smaller tree, and is therefore faster than the currently best algorithms for the problem (with O(n log n) time complexity [SODA 2013, ESA 2017]) whenever q = o(log n).
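
The quantities named in the abstract can be illustrated with a small brute-force sketch. It is not the authors' approximation algorithm, nor their O(q n) distance algorithm; it only evaluates the q-MAXRTC objective for a given tree and computes the rooted triplet distance naively. The encoding is an assumption made for illustration: a tree is a nested tuple whose leaves are strings, and a resolved triplet xy|z is the tuple (x, y, z). All function names are hypothetical.

```python
from itertools import combinations


def clusters(tree):
    """Return the leaf set below every internal node of a nested-tuple tree."""
    found = []

    def leaves(node):
        if isinstance(node, str):          # a leaf label
            return frozenset([node])
        leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.append(leaf_set)             # one cluster per internal node
        return leaf_set

    leaves(tree)
    return found


def count_consistent(tree, triplets):
    """Count triplets xy|z contained in the tree: some cluster holds x and y but not z."""
    cl = clusters(tree)
    return sum(
        any(x in c and y in c and z not in c for c in cl)
        for (x, y, z) in triplets
    )


def topology(cl, a, b, c):
    """Return the pair that is grouped together, or None if the three leaves form a fan."""
    for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
        if any(x in s and y in s and z not in s for s in cl):
            return (x, y)
    return None


def naive_triplet_distance(t1, t2):
    """Brute-force rooted triplet distance; assumes both trees share one leaf set."""
    c1, c2 = clusters(t1), clusters(t2)
    labels = sorted(max(c1, key=len))      # the root cluster holds every leaf
    return sum(
        topology(c1, *three) != topology(c2, *three)
        for three in combinations(labels, 3)
    )


# A fully resolved tree (3 internal nodes) and a coarser one (q = 2 internal nodes).
resolved = ((("a", "b"), "c"), "d")
coarse = (("a", "b", "c"), "d")
R = [("a", "b", "c"), ("a", "c", "d"), ("c", "d", "a"), ("b", "c", "d")]

print(len(clusters(resolved)), count_consistent(resolved, R))  # 3 3
print(len(clusters(coarse)), count_consistent(coarse, R))      # 2 2
print(naive_triplet_distance(resolved, coarse))                # 1
```

In this toy instance the coarser tree drops one internal node and loses only the triplet ab|c, which is the kind of trade-off between tree size and retained triplet information that the abstract describes. The brute-force distance enumerates all leaf triples and is meant only as a readable baseline.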

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Trees
Keywords
  • phylogenetic tree
  • supertree
  • rooted triplet
  • approximation algorithm

References

  1. A. V. Aho, Y. Sagiv, T. G. Szymanski, and J. D. Ullman. Inferring a Tree from Lowest Common Ancestors with an Application to the Optimization of Relational Expressions. SIAM Journal on Computing, 10(3):405-421, 1981.
  2. P. Alimonti. New local search approximation techniques for maximum generalized satisfiability problems. Information Processing Letters, 57(3):151-158, 1996.
  3. O. R. P. Bininda-Emonds. The evolution of supertrees. Trends in Ecology & Evolution, 19(6):315-322, 2004.
  4. G. S. Brodal, R. Fagerberg, C. N. S. Pedersen, T. Mailund, and A. Sand. Efficient Algorithms for Computing the Triplet and Quartet Distance Between Trees of Arbitrary Degree. In Proc. SODA 2013, pages 1814-1832, 2013.
  5. G. S. Brodal and K. Mampentzidis. Cache Oblivious Algorithms for Computing the Triplet Distance between Trees. In Proc. ESA 2017, pages 21:1-21:14, 2017.
  6. D. Bryant. Building Trees, Hunting for Trees, and Comparing Trees - Theory and Methods in Phylogenetic Analysis. PhD thesis, University of Canterbury, Christchurch, NZ, 1997.
  7. J. Byrka, P. Gawrychowski, K. T. Huber, and S. Kelk. Worst-case optimal approximation algorithms for maximizing triplet consistency within phylogenetic networks. Journal of Discrete Algorithms, 8(1):65-75, 2010.
  8. J. Byrka, S. Guillemot, and J. Jansson. New Results on Optimizing Rooted Triplets Consistency. Discrete Appl. Math., 158(11):1136-1147, 2010.
  9. L. A. Hug et al. A new view of the tree of life. Nature Microbiology, 1, 2016.
  10. L. Gąsieniec, J. Jansson, A. Lingas, and A. Östlin. On the Complexity of Constructing Evolutionary Trees. Journal of Combinatorial Optimization, 3(2):183-197, 1999.
  11. J. Håstad. Some Optimal Inapproximability Results. J. ACM, 48(4):798-859, 2001.
  12. J. Jansson, R. S. Lemence, and A. Lingas. The Complexity of Inferring a Minimally Resolved Phylogenetic Supertree. SIAM Journal on Computing, 41(1):272-291, 2012.
  13. J. Jansson, J. H.-K. Ng, K. Sadakane, and W.-K. Sung. Rooted Maximum Agreement Supertrees. Algorithmica, 43(4):293-307, 2005.
  14. J. Jansson, R. Rajaby, and W.-K. Sung. Minimal Phylogenetic Supertrees and Local Consensus Trees. AIMS Medical Science, 5:181, 2018.
  15. V. Kann, S. Khanna, J. Lagergren, and A. Panconesi. On the Hardness of Approximating Max k-Cut and Its Dual. Chicago Journal of Theoretical Computer Science, 1997.
  16. J. M. Lang, A. E. Darling, and J. A. Eisen. Phylogeny of bacterial and archaeal genomes using conserved genes: supertrees and supermatrices. PLoS ONE, 8(4), 2013.
  17. K. J. Locey and J. T. Lennon. Scaling laws predict global microbial diversity. Proceedings of the National Academy of Sciences, 2016.
  18. A. McKenzie and M. Steel. Distributions of cherries for two models of trees. Mathematical Biosciences, 164(1):81-92, 2000.
  19. L. Trevisan. Parallel Approximation Algorithms by Positive Linear Programming. Algorithmica, 21(1):72-88, 1998.
  20. D. P. Williamson and D. B. Shmoys. The Design of Approximation Algorithms, pages 108-109. Cambridge University Press, New York, NY, USA, 1st edition, 2011.
  21. B. Y. Wu. Constructing the Maximum Consensus Tree from Rooted Triples. Journal of Combinatorial Optimization, 8(1):29-39, 2004.
  22. U. Zwick. Approximation Algorithms for Constraint Satisfaction Problems Involving at Most Three Variables Per Constraint. In Proc. SODA 1998, pages 201-210, 1998.