Phyolin: Identifying a Linear Perfect Phylogeny in Single-Cell DNA Sequencing Data of Tumors

Authors Leah L. Weber, Mohammed El-Kebir

Author Details

Leah L. Weber
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL
Mohammed El-Kebir
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL


This work was a project in the course CS598MEB (Computational Cancer Genomics, Spring 2020) at UIUC. We thank the students in this course for their valuable feedback.

Cite As

Leah L. Weber and Mohammed El-Kebir. Phyolin: Identifying a Linear Perfect Phylogeny in Single-Cell DNA Sequencing Data of Tumors. In 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 172, pp. 5:1-5:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Cancer arises from an evolutionary process in which somatic mutations occur and eventually give rise to clonal expansions. Modeling this evolutionary process as a phylogeny is useful for treatment decision-making as well as for understanding evolutionary patterns across patients and cancer types. However, cancer phylogeny inference from single-cell DNA sequencing data of tumors is challenging due to limitations of sequencing technology and the complexity of the resulting problem. Therefore, as a first step, some value might be obtained from correctly classifying the evolutionary process as either linear or branched. The biological implications of these two high-level patterns are different, and understanding which cancer types and which patients have each of these trajectories could provide useful insight for both clinicians and researchers. Here, we introduce the Linear Perfect Phylogeny Flipping Problem as a means of testing a null model that the tree topology is linear, and show that it is NP-hard. We develop Phyolin and, through both in silico experiments and real data application, show that it is an accurate, easy-to-use, and reasonably fast method for classifying an evolutionary trajectory as linear or branched.
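
To make the linear versus branched distinction concrete, the following minimal sketch checks whether an error-free binary matrix is consistent with a linear topology. It is not Phyolin's method and does not solve the flipping problem. It assumes rows are cells, columns are mutations, and an entry of 1 means the cell carries the mutation; under those assumptions a linear perfect phylogeny requires the cell sets of all mutations to be nested. The function name `is_linear_perfect_phylogeny` and the example matrices are hypothetical.

```python
from itertools import combinations


def is_linear_perfect_phylogeny(matrix):
    """Return True if every pair of mutations has nested cell sets.

    Assumption (not from the paper's text): `matrix` is a list of rows,
    one per cell, with 0/1 entries, one per mutation, and no errors.
    """
    if not matrix:
        return True
    n_mutations = len(matrix[0])
    # For each mutation (column), collect the cells (rows) that carry it.
    cell_sets = [
        {i for i, row in enumerate(matrix) if row[j] == 1}
        for j in range(n_mutations)
    ]
    # Linear topology: for every pair, one cell set contains the other.
    return all(a <= b or b <= a for a, b in combinations(cell_sets, 2))


# Hypothetical example: mutations accumulate along a single lineage.
linear = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]

# Hypothetical example: mutations 2 and 3 occur in disjoint sets of cells.
branched = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
]

print(is_linear_perfect_phylogeny(linear))
print(is_linear_perfect_phylogeny(branched))
```

Real single-cell data contain false positives and false negatives, so an observed matrix rarely passes this check directly. The flipping problem named in the abstract concerns changing entries of the matrix so that a check of this kind succeeds.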

Subject Classification

ACM Subject Classification
  • Applied computing → Molecular evolution
Keywords
  • Constraint programming
  • intra-tumor heterogeneity
  • combinatorial optimization



