Parsimonious Clone Tree Reconciliation in Cancer

Authors: Palash Sashittal, Simone Zaccaria, Mohammed El-Kebir

Author Details

Palash Sashittal
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Simone Zaccaria
  • Computational Cancer Genomics Research Group, University College London Cancer Institute, London, UK
  • Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
Mohammed El-Kebir
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
  • Cancer Center at Illinois, University of Illinois at Urbana-Champaign, Urbana, IL, USA


This work was a project in the course CS598MEB (Computational Cancer Genomics, Spring 2021) at UIUC. We thank the students in this course for their valuable feedback. We also thank Ron Zeira for providing the code to compute distances between copy number profiles.

Cite As

Palash Sashittal, Simone Zaccaria, and Mohammed El-Kebir. Parsimonious Clone Tree Reconciliation in Cancer. In 21st International Workshop on Algorithms in Bioinformatics (WABI 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 201, pp. 9:1-9:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to copy-number aberrations (CNAs). As the analysis of this intra-tumor heterogeneity has important clinical applications, several computational methods have been introduced to identify clones from DNA sequencing data. However, due to technological and methodological limitations, current analyses are restricted to identifying tumor clones based on either SNVs or CNAs alone, preventing a comprehensive characterization of a tumor’s clonal composition. To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as a reconciliation problem while accounting for uncertainty in the input SNV and CNA proportions. We thus characterize the computational complexity of this problem and we introduce a mixed integer linear programming formulation to solve it exactly. On simulated data, we show that tumor clones can be identified reliably, especially when further taking into account the ancestral relationships that can be inferred from the input SNVs and CNAs. On 49 tumor samples from 10 prostate cancer patients, our reconciliation approach provides a higher resolution view of tumor evolution than previous studies.

Subject Classification

ACM Subject Classification
  • Applied computing → Computational genomics
Keywords
  • Intra-tumor heterogeneity
  • phylogenetics
  • mixed integer linear programming



