A Rearrangement Distance for Fully-Labelled Trees

Authors Giulia Bernardini, Paola Bonizzoni, Gianluca Della Vedova, Murray Patterson


Author Details

Giulia Bernardini
  • DISCo, Università degli Studi Milano - Bicocca, Italy
Paola Bonizzoni
  • DISCo, Università degli Studi Milano - Bicocca, Italy
Gianluca Della Vedova
  • DISCo, Università degli Studi Milano - Bicocca, Italy
Murray Patterson
  • DISCo, Università degli Studi Milano - Bicocca, Italy


The authors wish to thank Mauricio Soto Gomez for the inspiring discussions.

Cite As

Giulia Bernardini, Paola Bonizzoni, Gianluca Della Vedova, and Murray Patterson. A Rearrangement Distance for Fully-Labelled Trees. In 30th Annual Symposium on Combinatorial Pattern Matching (CPM 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 128, pp. 28:1-28:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Comparing trees that represent the evolutionary histories of cancerous tumors has turned out to be crucial, since a variety of different methods typically infer multiple possible trees. In a departure from the widely studied setting of classical phylogenetics, where trees are leaf-labelled, tumoral trees are fully labelled, i.e., every vertex has a label. In this paper we provide a rearrangement distance measure between two fully-labelled trees. This notion originates from two operations: one that modifies the topology of the tree, and one that permutes the labels of the vertices, hence leaving the topology unaffected. While we show that the distance between two trees in terms of each such operation alone can be decided in polynomial time, the more general notion of distance when both operations are allowed is NP-hard to decide. Despite this result, we show that it is fixed-parameter tractable, and we give a 4-approximation algorithm when one of the trees is binary.
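
As a rough illustration of the two kinds of operation mentioned in the abstract, the following Python sketch represents a fully-labelled rooted tree as a child-to-parent map and implements one label-permuting move and one topology-changing move. The representation, the function names (permute_labels, move_subtree, shape) and the particular "detach a subtree and reattach it elsewhere" move are assumptions made for this example; the abstract does not spell out the paper's exact definitions, and this is not an implementation of its distance.

```python
from typing import Dict, Optional

# Assumed representation: child label -> parent label; the root maps to None.
Tree = Dict[str, Optional[str]]


def permute_labels(tree: Tree, perm: Dict[str, str]) -> Tree:
    """Relabel vertices by a permutation of labels; the shape is untouched."""
    if set(perm) != set(perm.values()) or not set(perm) <= set(tree):
        raise ValueError("perm must permute a subset of the tree's labels")

    def new(label: str) -> str:
        return perm.get(label, label)

    return {new(v): (None if p is None else new(p)) for v, p in tree.items()}


def move_subtree(tree: Tree, v: str, new_parent: str) -> Tree:
    """Detach the subtree rooted at v and reattach it under new_parent.

    Hypothetical topology-changing move, used here only for illustration.
    """
    if tree[v] is None:
        raise ValueError("cannot move the root")
    # Walk up from new_parent; meeting v means new_parent lies inside v's subtree.
    u: Optional[str] = new_parent
    while u is not None:
        if u == v:
            raise ValueError("new_parent lies in the subtree of v")
        u = tree[u]
    moved = dict(tree)
    moved[v] = new_parent
    return moved


def shape(tree: Tree):
    """Canonical encoding of the unlabelled shape as nested sorted tuples."""
    children = {v: [] for v in tree}
    root = None
    for v, p in tree.items():
        if p is None:
            root = v
        else:
            children[p].append(v)

    def encode(v):
        return tuple(sorted(encode(c) for c in children[v]))

    return encode(root)


# Example: r has children a and b; c is a child of a.
tree = {"r": None, "a": "r", "b": "r", "c": "a"}
swapped = permute_labels(tree, {"a": "b", "b": "a"})  # labels move, shape stays
regrafted = move_subtree(tree, "c", "b")              # c now hangs under b
```

In this small example the same target tree is reached either by swapping the labels a and b or by moving c under b, which hints at why combining both kinds of operation makes finding a shortest sequence harder to reason about than using either kind alone.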

Subject Classification

ACM Subject Classification
  • Mathematics of computing → Trees
  • Mathematics of computing → Graph theory

Keywords
  • Tree rearrangement distance
  • Cancer progression
  • Approximation algorithms
  • Computational complexity



