Document

# New and Improved Algorithms for Unordered Tree Inclusion

## File

LIPIcs.ISAAC.2018.27.pdf
• Filesize: 0.51 MB
• 12 pages

## Cite As

Tatsuya Akutsu, Jesper Jansson, Ruiming Li, Atsuhiro Takasu, and Takeyuki Tamura. New and Improved Algorithms for Unordered Tree Inclusion. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 27:1-27:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ISAAC.2018.27

## Abstract

The tree inclusion problem is, given two node-labeled trees P and T (the "pattern tree" and the "text tree"), to locate every minimal subtree in T (if any) that can be obtained by applying a sequence of node insertion operations to P. Although the ordered tree inclusion problem is solvable in polynomial time, the unordered tree inclusion problem is NP-hard. The currently fastest algorithm for the latter is from 1995 and runs in O(poly(m,n) * 2^{2d}) = O^*(2^{2d}) time, where m and n are the sizes of the pattern and text trees, respectively, and d is the maximum outdegree of the pattern tree. Here, we develop a new algorithm that improves the exponent 2d to d by considering a particular type of ancestor-descendant relationships and applying dynamic programming, thus reducing the time complexity to O^*(2^d). We then study restricted variants of the unordered tree inclusion problem where the number of occurrences of different node labels and/or the input trees' heights are bounded. We show that although the problem remains NP-hard in many such cases, it can be solved in polynomial time for c = 2 and in O^*(1.8^d) time for c = 3 if the leaves of P are distinctly labeled and each label occurs at most c times in T. We also present a randomized O^*(1.883^d)-time algorithm for the case that the heights of P and T are one and two, respectively.

## Subject Classification

##### ACM Subject Classification
• Theory of computation → Graph algorithms analysis
##### Keywords
• parameterized algorithms
• tree inclusion
• unordered trees
• dynamic programming

## Metrics

• Access Statistics
• Total Accesses (updated on a weekly basis)
0

## References

1. Amir Abboud, Arturs Backurs, Thomas Dueholm Hansen, Virginia Vassilevska Williams, and Or Zamir. Subtree isomorphism revisited. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1256-1271. SIAM, 2018.
2. Amir Abboud, Virginia Vassilevska Williams, and Oren Weimann. Consequences of faster alignment of sequences. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming - Part 1, pages 39-51. Springer, 2014.
3. Tatsuya Akutsu, Daiji Fukagawa, Magnús M. Halldórsson, Atsuhiro Takasu, and Keisuke Tanaka. Approximation and parameterized algorithms for common subtrees and edit distance between unordered trees. Theoretical Computer Science, 470:10-22, 2013.
4. Tatsuya Akutsu, Daiji Fukagawa, Atsuhiro Takasu, and Takeyuki Tamura. Exact algorithms for computing the tree edit distance between unordered trees. Theoretical Computer Science, 412(4-5):352-364, 2011.
5. Tatsuya Akutsu, Takeyuki Tamura, Daiji Fukagawa, and Atsuhiro Takasu. Efficient exponential-time algorithms for edit distance between unordered trees. Journal of Discrete Algorithms, 25:79-93, 2014.
6. Noga Alon, Raphael Yuster, and Uri Zwick. Color-coding. Journal of the ACM, 42(4):844-856, 1995.
7. Philip Bille. A survey on tree edit distance and related problems. Theoretical Computer Science, 337(1):217-239, 2005.
8. Philip Bille and Inge Li Gørtz. The tree inclusion problem: In linear space and faster. ACM Transactions on Algorithms (TALG), 7(3):38, 2011.
9. Karl Bringmann, Pawel Gawrychowski, Shay Mozes, and Oren Weimann. Tree edit distance cannot be computed in strongly subcubic time (unless APSP can). In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1190-1206. SIAM, 2018.
10. Lijun Chang, Xuemin Lin, Wenjie Zhang, Jeffrey Xu Yu, Ying Zhang, and Lu Qin. Optimal enumeration: Efficient top-k tree matching. Proceedings of the VLDB Endowment, 8(5):533-544, 2015.
11. Sara Cohen and Nerya Or. A general algorithm for subtree similarity-search. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 928-939. IEEE, 2014.
12. Erik D. Demaine, Shay Mozes, Benjamin Rossman, and Oren Weimann. An optimal decomposition algorithm for tree edit distance. ACM Transactions on Algorithms (TALG), 6(1):2, 2009.
13. Minoru Kanehisa, Susumu Goto, Yoko Sato, Masayuki Kawashima, Miho Furumichi, and Mao Tanabe. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Research, 42(D1):D199-D205, 2013.
14. Pekka Kilpeläinen and Heikki Mannila. Ordered and unordered tree inclusion. SIAM Journal on Computing, 24(2):340-356, 1995.
15. Hanna Köpcke, Andreas Thor, and Erhard Rahm. Evaluation of entity resolution approaches on real-world match problems. Proceedings of the VLDB Endowment, 3(1-2):484-493, 2010.
16. Jiwei Li, Thang Luong, Dan Jurafsky, and Eduard H. Hovy. When Are Tree Structures Necessary for Deep Learning of Representations? In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages 2304-2314, 2015. URL: http://aclweb.org/anthology/D/D15/D15-1278.pdf.
17. Jiří Matoušek and Robin Thomas. On the complexity of finding iso-and other morphisms for partial k-trees. Discrete Mathematics, 108(1-3):343-364, 1992.
18. Tomoya Mori, Atsuhiro Takasu, Jesper Jansson, Jaewook Hwang, Takeyuki Tamura, and Tatsuya Akutsu. Similar subtree search using extended tree inclusion. IEEE Transactions on Knowledge and Data Engineering, 27(12):3360-3373, 2015.
19. Dennis Shasha, Jason T. L. Wang, Kaizhong Zhang, and Frank Y. Shih. Exact and approximate algorithms for unordered tree matching. IEEE Transactions on Systems, Man, and Cybernetics, 24(4):668-678, 1994.
20. Kuo-Chung Tai. The tree-to-tree correction problem. Journal of the ACM (JACM), 26(3):422-433, 1979.
21. Gabriel Valiente. Constrained tree inclusion. Journal of Discrete Algorithms, 3(2):431-447, 2005.
22. Masaki Yamamoto. An improved O^∗(1.234^m)-time deterministic algorithm for SAT. In Proceedings of the 16th International Symposium on Algorithms and Computation, pages 644-653. Springer, 2005.
23. Mohammed Javeed Zaki. Efficiently mining frequent trees in a forest: Algorithms and applications. IEEE Transactions on Knowledge and Data Engineering, 17(8):1021-1035, 2005.
24. Kaizhong Zhang and Tao Jiang. Some MAX SNP-hard results concerning unordered labeled trees. Information Processing Letters, 49(5):249-254, 1994.
25. Kaizhong Zhang, Rick Statman, and Dennis Shasha. On the editing distance between unordered labeled trees. Information Processing Letters, 42(3):133-139, 1992.