The Bourque Distances for Mutation Trees of Cancers

Authors: Katharina Jahn, Niko Beerenwinkel, Louxin Zhang


Author Details

Katharina Jahn
  • Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  • SIB Swiss Institute of Bioinformatics, Basel, Switzerland
Niko Beerenwinkel
  • Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  • SIB Swiss Institute of Bioinformatics, Basel, Switzerland
Louxin Zhang
  • Department of Mathematics and Computational Biology Program, National University of Singapore, Singapore

Cite As

Katharina Jahn, Niko Beerenwinkel, and Louxin Zhang. The Bourque Distances for Mutation Trees of Cancers. In 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 172, pp. 14:1-14:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Abstract

Mutation trees are rooted trees of arbitrary node degree in which each node is labeled with a mutation set. These trees, also referred to as clonal trees, are used in computational oncology to represent the mutational history of tumours. Classical tree metrics such as the popular Robinson-Foulds distance are of limited use for the comparison of mutation trees. One reason is that mutation trees inferred with different methods or for different patients often contain different sets of mutation labels. Here, we generalize the Robinson-Foulds distance into a set of distance metrics called Bourque distances for comparing mutation trees. A connection between the Robinson-Foulds distance and the nearest neighbor interchange distance is also presented.
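
For orientation, the classical Robinson-Foulds distance that the paper takes as its starting point can be sketched as follows. This is a minimal illustration of the standard rooted, leaf-labeled variant (the number of leaf clusters found in exactly one of the two trees), not the Bourque distances defined in the paper. The dictionary-based tree encoding and the names `clusters` and `rf_distance` are assumptions made for this example only.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical encoding: each node maps to its list of children.
# Nodes without an entry (or with an empty list) are leaves.
Tree = Dict[str, List[str]]


def clusters(tree: Tree, root: str) -> Set[FrozenSet[str]]:
    """Collect, for every node, the set of leaf labels below it."""
    found: Set[FrozenSet[str]] = set()

    def leaves_below(node: str) -> FrozenSet[str]:
        children = tree.get(node, [])
        if not children:
            below = frozenset([node])
        else:
            below = frozenset().union(*(leaves_below(c) for c in children))
        found.add(below)
        return below

    leaves_below(root)
    return found


def rf_distance(t1: Tree, r1: str, t2: Tree, r2: str) -> int:
    """Classical rooted Robinson-Foulds distance: size of the symmetric
    difference of the two cluster sets."""
    return len(clusters(t1, r1) ^ clusters(t2, r2))


# Two rooted trees on the leaves a, b, c: ((a,b),c) and (a,(b,c)).
tree_1: Tree = {"r": ["x", "c"], "x": ["a", "b"]}
tree_2: Tree = {"r": ["a", "y"], "y": ["b", "c"]}

print(rf_distance(tree_1, "r", tree_2, "r"))  # clusters {a,b} and {b,c} differ
```

Only the leaves carry labels in this sketch, and both trees are assumed to share the same leaf set. Mutation trees label every node and may carry different label sets, which is the limitation the abstract points to.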

Subject Classification

ACM Subject Classification
  • Applied computing → Bioinformatics

Keywords
  • mutation trees
  • clonal trees
  • tree distance
  • phylogenetic trees
  • tree metric
  • Robinson-Foulds distance
  • Bourque distance



