Edit Distance between Unrooted Trees in Cubic Time

Authors Bartlomiej Dudek, Pawel Gawrychowski



PDF
Thumbnail PDF

File

LIPIcs.ICALP.2018.45.pdf
  • Filesize: 0.65 MB
  • 14 pages

Document Identifiers

Author Details

Bartlomiej Dudek
  • Institute of Computer Science, University of Wrocław, Poland
Pawel Gawrychowski
  • Institute of Computer Science, University of Wrocław, Poland

Cite AsGet BibTex

Bartlomiej Dudek and Pawel Gawrychowski. Edit Distance between Unrooted Trees in Cubic Time. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 107, pp. 45:1-45:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ICALP.2018.45

Abstract

Edit distance between trees is a natural generalization of the classical edit distance between strings, in which the allowed elementary operations are contraction, uncontraction and relabeling of an edge. Demaine et al. [ACM Trans. on Algorithms, 6(1), 2009] showed how to compute the edit distance between rooted trees on n nodes in O(n^3) time. However, generalizing their method to unrooted trees seems quite problematic, and the most efficient known solution remains to be the previous O(n^3 log n) time algorithm by Klein [ESA 1998]. Given the lack of progress on improving this complexity, it might appear that unrooted trees are simply more difficult than rooted trees. We show that this is, in fact, not the case, and edit distance between unrooted trees on n nodes can be computed in O(n^3) time. A significantly faster solution is unlikely to exist, as Bringmann et al. [SODA 2018] proved that the complexity of computing the edit distance between rooted trees cannot be decreased to O(n^{3-epsilon}) unless some popular conjecture fails, and the lower bound easily extends to unrooted trees. We also show that for two unrooted trees of size m and n, where m <=n, our algorithm can be modified to run in O(nm^2(1+log(n/m))). This, again, matches the complexity achieved by Demaine et al. for rooted trees, who also showed that this is optimal if we restrict ourselves to the so-called decomposition algorithms.

Subject Classification

ACM Subject Classification
  • Theory of computation → Design and analysis of algorithms
Keywords
  • tree edit distance
  • dynamic programming
  • heavy light decomposition

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amir Abboud, Thomas Dueholm Hansen, Virginia Vassilevska Williams, and Ryan Williams. Simulating branching programs with edit distance and friends: or: a polylog shaved is a lower bound made. In 48th STOC, pages 375-388, 2016. Google Scholar
  2. Tatsuya Akutsu, Daiji Fukagawa, and Atsuhiro Takasu. Approximating tree edit distance through string edit distance. Algorithmica, 57(2):325-348, 2010. Google Scholar
  3. Rajeev Alur, Loris D'Antoni, Sumit Gulwani, Dileep Kini, and Mahesh Viswanathan. Automated grading of DFA constructions. In 23rd IJCAI, pages 1976-1982, 2013. Google Scholar
  4. Taku Aratsu, Kouichi Hirata, and Tetsuji Kuboyama. Approximating tree edit distance through string edit distance for binary tree codes. Fundam. Inf., 101(3):157-171, 2010. Google Scholar
  5. Arturs Backurs and Piotr Indyk. Edit distance cannot be computed in strongly subquadratic time (unless SETH is false). In 47th STOC, pages 51-58, 2015. Google Scholar
  6. J. Bellando and R. Kothari. Region-based modeling and tree edit distance as a basis for gesture recognition. In 10th ICIAP, pages 698-703, 1999. Google Scholar
  7. Philip Bille. A survey on tree edit distance and related problems. Theor. Comput. Sci., 337(1-3):217-239, 2005. Google Scholar
  8. Norbert Blum and Kurt Mehlhorn. On the average number of rebalancing operations in weight-balanced trees. Theor. Comput. Sci., 11(3):303-320, 1980. Google Scholar
  9. Karl Bringmann, Paweł Gawrychowski, Shay Mozes, and Oren Weimann. Tree edit distance cannot be computed in strongly subcubic time (unless APSP can). In 29th SODA, 2018. Google Scholar
  10. Peter Buneman, Martin Grohe, and Christoph Koch. Path queries on compressed XML. In 29th VLDB, pages 141-152, 2003. Google Scholar
  11. Sudarshan S. Chawathe. Comparing hierarchical data in external memory. In 25th VLDB, pages 90-101, 1999. Google Scholar
  12. Richard Cole, Lee-Ad Gottlieb, and Moshe Lewenstein. Dictionary matching and indexing with errors and don't cares. In 36th STOC, pages 91-100, 2004. Google Scholar
  13. Erik D. Demaine, Shay Mozes, Benjamin Rossman, and Oren Weimann. An optimal decomposition algorithm for tree edit distance. ACM Trans. Algorithms, 6(1):2:1-2:19, 2009. Google Scholar
  14. Bartłomiej Dudek and Paweł Gawrychowski. Edit distance between unrooted trees in cubic time. CoRR, abs/1804.10186, 2018. URL: http://arxiv.org/abs/1804.10186.
  15. Serge Dulucq and Hélène Touzet. Decomposition algorithms for the tree edit distance problem. J. Discrete Algorithms, 3(2-4):448-471, 2005. Google Scholar
  16. Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini, and S. Muthukrishnan. Compressing and indexing labeled trees, with applications. J. ACM, 57(1):4:1-4:33, 2009. Google Scholar
  17. Matthias Höchsmann, Thomas Töller, Robert Giegerich, and Stefan Kurtz. Local similarity in RNA secondary structures. In 2nd CSB, pages 159-168, 2003. Google Scholar
  18. Christoph M. Hoffmann and Michael J. O'Donnell. Pattern matching in trees. J. ACM, 29(1):68-95, Jan. 1982. Google Scholar
  19. Egor Ivkin. Approximating tree edit distance through string edit distance for binary tree codes. B.Sc. thesis, Charles University in Prague, 2012. Google Scholar
  20. Philip Klein, Srikanta Tirthapura, Daniel Sharvit, and Ben Kimia. A tree-edit-distance algorithm for comparing simple, closed shapes. In 11th SODA, pages 696-704, 2000. Google Scholar
  21. Philip N. Klein. Computing the edit-distance between unrooted ordered trees. In 6th ESA, pages 91-102, 1998. Google Scholar
  22. Philip N. Klein, Thomas B. Sebastian, and Benjamin B. Kimia. Shape matching using edit-distance: An implementation. In 12th SODA, pages 781-790, 2001. Google Scholar
  23. Mateusz Pawlik and Nikolaus Augsten. Efficient computation of the tree edit distance. ACM Trans. Database Syst., 40(1):3:1-3:40, 2015. Google Scholar
  24. Juan Ramón Rico-Juan and Luisa Micó. Comparison of aesa and laesa search algorithms using string and tree-edit-distances. Pattern Recogn. Lett., 24(9-10):1417-1426, jun 2003. Google Scholar
  25. T. B. Sebastian, P. N. Klein, and B. B. Kimia. Recognition of shapes by editing their shock graphs. IEEE Trans. Pattern Anal. Mach. Intell., 26(5):550-571, May 2004. Google Scholar
  26. B. A. Shapiro and K. Z. Zhang. Comparing multiple RNA secondary structures using tree comparisons. Comput. Appl. Biosci., 6(4):309-318, Oct. 1990. Google Scholar
  27. Dennis Shasha and Kaizhong Zhang. Fast algorithms for the unit cost editing distance between trees. J. Algorithms, 11(4):581-621, dec 1990. Google Scholar
  28. Daniel D. Sleator and Robert Endre Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362-391, 1983. Google Scholar
  29. Kuo-Chung Tai. The tree-to-tree correction problem. J. ACM, 26(3):422-433, 1979. Google Scholar
  30. Robert A. Wagner and Michael J. Fischer. The string-to-string correction problem. J. ACM, 21(1):168-173, 1974. Google Scholar
  31. Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, and Peter Clark. Answer extraction as sequence tagging with tree edit distance. In HLT-NAACL, pages 858-867, 2013. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail