Explaining why and how a tree t structurally differs from another tree t^⋆ is a question that is encountered throughout computer science, including in understanding tree-structured data such as XML or JSON data. In this article, we explore how to learn explanations for structural differences between pairs of trees from sample data: suppose we are given a set {(t₁, t₁^⋆), …, (t_n, t_n^⋆)} of pairs of labelled, ordered trees; is there a small set of rules that explains the structural differences between all pairs (t_i, t_i^⋆)? This raises two research questions: (i) what is a good notion of "rule" in this context; and (ii) how can sets of rules explaining a data set be learned algorithmically? We explore these questions from the perspective of database theory by (1) introducing a pattern-based specification language for tree transformations; (2) exploring the computational complexity of variants of the above algorithmic problem, e.g., showing NP-hardness for very restricted variants; and (3) discussing how to solve the problem for data from CS education research using SAT solvers.
@InProceedings{neider_et_al:LIPIcs.ICDT.2025.24,
  author    = {Neider, Daniel and Sabellek, Leif and Schmidt, Johannes and Vehlken, Fabian and Zeume, Thomas},
  title     = {{Learning Tree Pattern Transformations}},
  booktitle = {28th International Conference on Database Theory (ICDT 2025)},
  pages     = {24:1--24:20},
  series    = {Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN      = {978-3-95977-364-5},
  ISSN      = {1868-8969},
  year      = {2025},
  volume    = {328},
  editor    = {Roy, Sudeepa and Kara, Ahmet},
  publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address   = {Dagstuhl, Germany},
  URL       = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICDT.2025.24},
  URN       = {urn:nbn:de:0030-drops-229652},
  doi       = {10.4230/LIPIcs.ICDT.2025.24},
  annote    = {Keywords: Tree pattern transformations, learning from positive examples, computational complexity}
}