The Most Parsimonious Reconciliation Problem in the Presence of Incomplete Lineage Sorting and Hybridization Is NP-Hard

Authors: Matthew LeMay, Yi-Chieh Wu, Ran Libeskind-Hadas

Author Details

Matthew LeMay
  • Department of Mathematics, Harvey Mudd College, Claremont, CA, USA
Yi-Chieh Wu
  • Department of Computer Science, Harvey Mudd College, Claremont, CA, USA
Ran Libeskind-Hadas
  • Department of Computer Science, Harvey Mudd College, Claremont, CA, USA


The authors thank Adam Walker and the anonymous reviewers for valuable comments that helped improve the paper.

Cite As

Matthew LeMay, Yi-Chieh Wu, and Ran Libeskind-Hadas. The Most Parsimonious Reconciliation Problem in the Presence of Incomplete Lineage Sorting and Hybridization Is NP-Hard. In 21st International Workshop on Algorithms in Bioinformatics (WABI 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 201, pp. 1:1-1:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Abstract

The maximum parsimony phylogenetic reconciliation problem seeks to explain incongruity between a gene phylogeny and a species phylogeny with respect to a set of evolutionary events. While the reconciliation problem is well studied for species and gene trees subject to events such as duplication, transfer, loss, and deep coalescence, recent work has examined species phylogenies that incorporate hybridization and are thus represented by networks rather than trees. In this paper, we show that the problem of computing a maximum parsimony reconciliation for a gene tree and species network is NP-hard even when only considering deep coalescence. This result suggests that future work on maximum parsimony reconciliation for species networks should explore approximation algorithms and heuristics.
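To make the notion of a deep coalescence cost concrete, the sketch below counts extra lineages for a gene tree placed inside a species tree (not a network) using the standard least-common-ancestor mapping. It is an illustration only and is not taken from the paper: the nested-tuple tree encoding, the function names, and the assumption that every gene leaf is labeled directly with its species name are all choices made for this example. It covers only the tree case; the paper's result concerns species networks, where this simple counting approach does not carry over.

```python
def clusters(tree):
    """Return (leaf set of tree, list of leaf sets for every node in tree)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves, nodes = frozenset(), []
    for child in tree:
        child_leaves, child_nodes = clusters(child)
        leaves |= child_leaves
        nodes += child_nodes
    nodes.append(leaves)
    return leaves, nodes


def gene_edges(tree):
    """Return (species set below tree, list of (child set, parent set) edges)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    results = [gene_edges(child) for child in tree]
    here = frozenset().union(*(s for s, _ in results))
    edges = []
    for child_set, sub_edges in results:
        edges += sub_edges
        edges.append((child_set, here))
    return here, edges


def lca(species_clusters, species_set):
    """Smallest species-tree cluster that contains species_set."""
    return min((c for c in species_clusters if species_set <= c), key=len)


def deep_coalescence_cost(species_tree, gene_tree):
    """Sum over species branches of (gene lineages at top of branch - 1)."""
    root, sp = clusters(species_tree)
    _, edges = gene_edges(gene_tree)
    mapped = [(lca(sp, u), lca(sp, p)) for u, p in edges]
    cost = 0
    for v in sp:
        if v == root:
            continue  # the root has no branch above it
        # A gene edge crosses the branch above v when its lower end maps
        # inside v and its upper end maps strictly above v.
        lineages = sum(1 for mu, mp in mapped if mu <= v and v < mp)
        cost += max(lineages - 1, 0)
    return cost


# Hypothetical three-species example: S = ((A,B),C).
species = (("A", "B"), "C")
print(deep_coalescence_cost(species, (("A", "B"), "C")))  # congruent: 0
print(deep_coalescence_cost(species, (("A", "C"), "B")))  # incongruent: 1
```

In the second call, the lineages from A and B fail to coalesce in their common ancestral branch, so that branch carries one extra lineage. In a network, a gene lineage may follow either parent at a hybridization node, so a reconciliation must also choose paths rather than rely on a single fixed mapping.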

Subject Classification

ACM Subject Classification
  • Applied computing → Computational biology

Keywords
  • phylogenetics
  • reconciliation
  • deep coalescence
  • hybridization
  • NP-hardness



