Learning Definable Hypotheses on Trees

Grienenberger, Emilie; Ritzert, Martin

doi:10.4230/LIPIcs.ICDT.2019.24

ACM Subject Classification

Theory of computation → Logic

Keywords

monadic second-order logic
trees
query learning

Abstract

We study the problem of learning properties of nodes in tree structures. These properties are specified by logical formulas, such as formulas of first-order or monadic second-order (MSO) logic. We think of the tree as a database encoding a large dataset and therefore aim for learning algorithms whose running time depends at most sublinearly on the size of the tree. We present a learning algorithm for quantifier-free formulas whose running time depends only polynomially on the number of training examples and not on the size of the background structure. A previous result on strings shows that for general first-order or MSO formulas a sublinear running time cannot be achieved. However, we show that by building an index on the tree in a linear-time preprocessing phase, we can obtain a learning algorithm for MSO formulas whose learning phase takes logarithmic time.
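To make the quantifier-free case concrete, the following is a minimal sketch under strong simplifying assumptions and is not the algorithm from the paper. It assumes a hypothetical signature with only unary label predicates and no parameters, so that a node's quantifier-free type is just the set of labels that hold at it. The names `learn_quantifier_free`, `classify`, and `labels_of` are illustrative. The learner touches only the example nodes, so its running time depends on the number of training examples and not on the size of the tree.

```python
from typing import Callable, Dict, FrozenSet, Hashable, Iterable, Optional, Set, Tuple

Node = Hashable
AtomicType = FrozenSet[str]  # assumption: a type is the set of labels true at a node


def learn_quantifier_free(
    examples: Iterable[Tuple[Node, bool]],
    labels_of: Callable[[Node], Iterable[str]],
) -> Optional[Set[AtomicType]]:
    """Return the atomic types to classify as positive, or None if no
    hypothesis over the label predicates is consistent with the examples."""
    seen: Dict[AtomicType, bool] = {}
    for node, is_positive in examples:
        node_type = frozenset(labels_of(node))
        # Two examples with the same type but different classes cannot be
        # separated by any quantifier-free formula over these predicates.
        if seen.setdefault(node_type, is_positive) != is_positive:
            return None
    return {t for t, positive in seen.items() if positive}


def classify(
    hypothesis: Set[AtomicType],
    labels_of: Callable[[Node], Iterable[str]],
    node: Node,
) -> bool:
    """Evaluate the learned hypothesis on an arbitrary node of the tree."""
    return frozenset(labels_of(node)) in hypothesis
```

The returned set corresponds to a disjunction of complete conjunctions over the label predicates, which is a quantifier-free formula. Types that do not occur among the examples are classified as negative, which is one arbitrary but consistent choice.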

Cite As

Emilie Grienenberger and Martin Ritzert. Learning Definable Hypotheses on Trees. In 22nd International Conference on Database Theory (ICDT 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 127, pp. 24:1-24:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019) https://doi.org/10.4230/LIPIcs.ICDT.2019.24

Author Details

Emilie Grienenberger

ENS Paris-Saclay, 61 Avenue du Président Wilson, 94230 Cachan, France

Martin Ritzert

RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
