A Combinatorial Approach for Single-cell Variant Detection via Phylogenetic Inference

Authors Mohammadamin Edrisi, Hamim Zafar, Luay Nakhleh



Author Details

Mohammadamin Edrisi
  • Department of Computer Science, Rice University, Houston, TX, USA
Hamim Zafar
  • Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Luay Nakhleh
  • Department of Computer Science, Rice University, Houston, TX, USA

Cite As

Mohammadamin Edrisi, Hamim Zafar, and Luay Nakhleh. A Combinatorial Approach for Single-cell Variant Detection via Phylogenetic Inference. In 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 143, pp. 22:1-22:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Abstract

Single-cell sequencing provides a powerful approach for elucidating intratumor heterogeneity by resolving cell-to-cell variability. However, it also poses additional challenges, including elevated error rates, allelic dropout, and non-uniform coverage. A recently introduced single-cell-specific mutation detection algorithm leverages the evolutionary relationship between cells to denoise the data, but because of its probabilistic nature it does not scale well with the number of cells. Here, we develop a novel combinatorial approach that uses the genealogical relationship of cells to detect mutations from noisy single-cell sequencing data. Our method, called scVILP, jointly detects mutations in individual cells and reconstructs a perfect phylogeny among these cells. We employ a novel integer linear programming (ILP) formulation to solve the joint inference problem deterministically and efficiently. We show that, on simulated data, scVILP achieves similar or better accuracy than existing methods with significantly better runtime. We also applied scVILP to an empirical human cancer dataset from a patient with high-grade serous ovarian cancer.
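As a rough illustration of the perfect phylogeny requirement mentioned in the abstract, the sketch below checks a binary cell-by-mutation matrix for pairwise conflicts. It assumes the standard setting of an unmutated root under the infinite-sites model, where a perfect phylogeny exists exactly when no two mutations show all three patterns (1,1), (1,0), and (0,1) across cells. This is not the scVILP ILP, whose formulation is not given on this page; the function names and toy matrices are hypothetical.

```python
from itertools import combinations


def has_conflict(col_a, col_b):
    """Return True if two mutation columns show all of (1,1), (1,0), (0,1)."""
    patterns = set(zip(col_a, col_b))
    return {(1, 1), (1, 0), (0, 1)} <= patterns


def is_perfect_phylogeny(matrix):
    """matrix[i][j] is 1 if cell i carries mutation j, else 0.

    Assumes an all-zero (unmutated) root; hypothetical helper for illustration.
    """
    columns = list(zip(*matrix))  # one tuple per mutation
    return not any(has_conflict(a, b) for a, b in combinations(columns, 2))


# Toy genotype matrices (made-up data): 4 cells x 3 mutations.
clean = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
noisy = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],  # mutation 0 missing here, e.g. through allelic dropout
    [0, 0, 1],
]

print(is_perfect_phylogeny(clean))
print(is_perfect_phylogeny(noisy))
```

A single dropped call in `noisy` is enough to break the condition, which suggests why calling mutations and fitting the tree jointly can help correct such errors.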

Subject Classification

ACM Subject Classification
  • Applied computing → Computational genomics

Keywords
  • Mutation calling
  • Single-cell sequencing
  • Integer linear programming
  • Perfect phylogeny



