New and Improved Algorithms for Unordered Tree Inclusion

Authors Tatsuya Akutsu, Jesper Jansson, Ruiming Li, Atsuhiro Takasu, Takeyuki Tamura



PDF
Thumbnail PDF

File

LIPIcs.ISAAC.2018.27.pdf
  • Filesize: 0.51 MB
  • 12 pages

Document Identifiers

Author Details

Tatsuya Akutsu
  • Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
Jesper Jansson
  • Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
Ruiming Li
  • Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
Atsuhiro Takasu
  • National Institute of Informatics, Chiyoda-ku, Tokyo, 101-8430, Japan
Takeyuki Tamura
  • Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan

Cite AsGet BibTex

Tatsuya Akutsu, Jesper Jansson, Ruiming Li, Atsuhiro Takasu, and Takeyuki Tamura. New and Improved Algorithms for Unordered Tree Inclusion. In 29th International Symposium on Algorithms and Computation (ISAAC 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 123, pp. 27:1-27:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ISAAC.2018.27

Abstract

The tree inclusion problem is, given two node-labeled trees P and T (the "pattern tree" and the "text tree"), to locate every minimal subtree in T (if any) that can be obtained by applying a sequence of node insertion operations to P. Although the ordered tree inclusion problem is solvable in polynomial time, the unordered tree inclusion problem is NP-hard. The currently fastest algorithm for the latter is from 1995 and runs in O(poly(m,n) * 2^{2d}) = O^*(2^{2d}) time, where m and n are the sizes of the pattern and text trees, respectively, and d is the maximum outdegree of the pattern tree. Here, we develop a new algorithm that improves the exponent 2d to d by considering a particular type of ancestor-descendant relationships and applying dynamic programming, thus reducing the time complexity to O^*(2^d). We then study restricted variants of the unordered tree inclusion problem where the number of occurrences of different node labels and/or the input trees' heights are bounded. We show that although the problem remains NP-hard in many such cases, it can be solved in polynomial time for c = 2 and in O^*(1.8^d) time for c = 3 if the leaves of P are distinctly labeled and each label occurs at most c times in T. We also present a randomized O^*(1.883^d)-time algorithm for the case that the heights of P and T are one and two, respectively.

Subject Classification

ACM Subject Classification
  • Theory of computation → Graph algorithms analysis
Keywords
  • parameterized algorithms
  • tree inclusion
  • unordered trees
  • dynamic programming

Metrics

  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    0
    PDF Downloads

References

  1. Amir Abboud, Arturs Backurs, Thomas Dueholm Hansen, Virginia Vassilevska Williams, and Or Zamir. Subtree isomorphism revisited. In Proceedings of the 27th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1256-1271. SIAM, 2018. Google Scholar
  2. Amir Abboud, Virginia Vassilevska Williams, and Oren Weimann. Consequences of faster alignment of sequences. In Proceedings of the 41st International Colloquium on Automata, Languages, and Programming - Part 1, pages 39-51. Springer, 2014. Google Scholar
  3. Tatsuya Akutsu, Daiji Fukagawa, Magnús M. Halldórsson, Atsuhiro Takasu, and Keisuke Tanaka. Approximation and parameterized algorithms for common subtrees and edit distance between unordered trees. Theoretical Computer Science, 470:10-22, 2013. Google Scholar
  4. Tatsuya Akutsu, Daiji Fukagawa, Atsuhiro Takasu, and Takeyuki Tamura. Exact algorithms for computing the tree edit distance between unordered trees. Theoretical Computer Science, 412(4-5):352-364, 2011. Google Scholar
  5. Tatsuya Akutsu, Takeyuki Tamura, Daiji Fukagawa, and Atsuhiro Takasu. Efficient exponential-time algorithms for edit distance between unordered trees. Journal of Discrete Algorithms, 25:79-93, 2014. Google Scholar
  6. Noga Alon, Raphael Yuster, and Uri Zwick. Color-coding. Journal of the ACM, 42(4):844-856, 1995. Google Scholar
  7. Philip Bille. A survey on tree edit distance and related problems. Theoretical Computer Science, 337(1):217-239, 2005. Google Scholar
  8. Philip Bille and Inge Li Gørtz. The tree inclusion problem: In linear space and faster. ACM Transactions on Algorithms (TALG), 7(3):38, 2011. Google Scholar
  9. Karl Bringmann, Pawel Gawrychowski, Shay Mozes, and Oren Weimann. Tree edit distance cannot be computed in strongly subcubic time (unless APSP can). In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1190-1206. SIAM, 2018. Google Scholar
  10. Lijun Chang, Xuemin Lin, Wenjie Zhang, Jeffrey Xu Yu, Ying Zhang, and Lu Qin. Optimal enumeration: Efficient top-k tree matching. Proceedings of the VLDB Endowment, 8(5):533-544, 2015. Google Scholar
  11. Sara Cohen and Nerya Or. A general algorithm for subtree similarity-search. In Data Engineering (ICDE), 2014 IEEE 30th International Conference on, pages 928-939. IEEE, 2014. Google Scholar
  12. Erik D. Demaine, Shay Mozes, Benjamin Rossman, and Oren Weimann. An optimal decomposition algorithm for tree edit distance. ACM Transactions on Algorithms (TALG), 6(1):2, 2009. Google Scholar
  13. Minoru Kanehisa, Susumu Goto, Yoko Sato, Masayuki Kawashima, Miho Furumichi, and Mao Tanabe. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Research, 42(D1):D199-D205, 2013. Google Scholar
  14. Pekka Kilpeläinen and Heikki Mannila. Ordered and unordered tree inclusion. SIAM Journal on Computing, 24(2):340-356, 1995. Google Scholar
  15. Hanna Köpcke, Andreas Thor, and Erhard Rahm. Evaluation of entity resolution approaches on real-world match problems. Proceedings of the VLDB Endowment, 3(1-2):484-493, 2010. Google Scholar
  16. Jiwei Li, Thang Luong, Dan Jurafsky, and Eduard H. Hovy. When Are Tree Structures Necessary for Deep Learning of Representations? In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages 2304-2314, 2015. URL: http://aclweb.org/anthology/D/D15/D15-1278.pdf.
  17. Jiří Matoušek and Robin Thomas. On the complexity of finding iso-and other morphisms for partial k-trees. Discrete Mathematics, 108(1-3):343-364, 1992. Google Scholar
  18. Tomoya Mori, Atsuhiro Takasu, Jesper Jansson, Jaewook Hwang, Takeyuki Tamura, and Tatsuya Akutsu. Similar subtree search using extended tree inclusion. IEEE Transactions on Knowledge and Data Engineering, 27(12):3360-3373, 2015. Google Scholar
  19. Dennis Shasha, Jason T. L. Wang, Kaizhong Zhang, and Frank Y. Shih. Exact and approximate algorithms for unordered tree matching. IEEE Transactions on Systems, Man, and Cybernetics, 24(4):668-678, 1994. Google Scholar
  20. Kuo-Chung Tai. The tree-to-tree correction problem. Journal of the ACM (JACM), 26(3):422-433, 1979. Google Scholar
  21. Gabriel Valiente. Constrained tree inclusion. Journal of Discrete Algorithms, 3(2):431-447, 2005. Google Scholar
  22. Masaki Yamamoto. An improved O^∗(1.234^m)-time deterministic algorithm for SAT. In Proceedings of the 16th International Symposium on Algorithms and Computation, pages 644-653. Springer, 2005. Google Scholar
  23. Mohammed Javeed Zaki. Efficiently mining frequent trees in a forest: Algorithms and applications. IEEE Transactions on Knowledge and Data Engineering, 17(8):1021-1035, 2005. Google Scholar
  24. Kaizhong Zhang and Tao Jiang. Some MAX SNP-hard results concerning unordered labeled trees. Information Processing Letters, 49(5):249-254, 1994. Google Scholar
  25. Kaizhong Zhang, Rick Statman, and Dennis Shasha. On the editing distance between unordered labeled trees. Information Processing Letters, 42(3):133-139, 1992. Google Scholar
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail