Inferring Symbolic Automata

Fisman, Dana; Frenkel, Hadar; Zilles, Sandra

doi:10.4230/LIPIcs.CSL.2022.21

Abstract

We study the learnability of symbolic finite state automata, a model shown useful in many applications in software verification. The state-of-the-art literature on this topic follows the query learning paradigm, and so far all obtained results are positive. We provide a necessary condition for efficient learnability of SFAs in this paradigm, from which we obtain the first negative result. The main focus of our work lies in the learnability of SFAs under the paradigm of identification in the limit using polynomial time and data. We provide a necessary condition and a sufficient condition for efficient learnability of SFAs in this paradigm, from which we derive a positive and a negative result.

F. Aarts and F. Vaandrager. Learning I/O automata. In Concurrency Theory, 21th Int. Conf., CONCUR 2010, pages 71-85, 2010.
R. Alur, L. Fix, and T. A. Henzinger. Event-clock automata: A determinizable class of timed automata. Theor. Comput. Sci., 211(1-2):253-273, 1999.
D. Angluin. A note on the number of queries needed to identify regular languages. Inf. Control., 51(1):76-87, 1981.
D. Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87-106, 1987.
D. Angluin. Queries and concept learning. Machine Learning, 2(4):319-342, 1988.
D. Angluin. Negative results for equivalence queries. Mach. Learn., 5:121-150, 1990.
D. Angluin, S. Eisenstat, and D. Fisman. Learning regular languages via alternating automata. In Proc. of the Twenty-Fourth Int. Joint Conf. on Artificial Intelligence, IJCAI, pages 3308-3314, 2015.
D. Angluin and D. Fisman. Learning regular omega languages. Theor. Comput. Sci., 650:57-72, 2016.
G. Argyros and L. D'Antoni. The learnability of symbolic automata. In Computer Aided Verification - 30th Int. Conf., CAV, volume 10981 of LNCS, pages 427-445. Springer, 2018.
G. Argyros, I. Stais, S. Jana, A. D. Keromytis, and A. Kiayias. Sfadiff: Automated evasion attacks and fingerprinting using black-box differential automata learning. In Proc. of the 2016 ACM SIGSAC Conf. on Computer and Communications Security, pages 1690-1701. ACM, 2016.
G. Argyros, I. Stais, A. Kiayias, and A. D. Keromytis. Back in black: Towards formal, black box analysis of sanitizers and filters. In IEEE Symposium on Security and Privacy, SP, pages 91-109, 2016.
A. Beimel and E. Kushilevitz. Learning boxes in high dimension. Algorithmica, 22(1/2):76-90, 1998.
A. Beimel and E. Kushilevitz. Learning unions of high-dimensional boxes over the reals. Inf. Process. Lett., 73(5-6):213-220, 2000.
T. Berg, O. Grinchtein, B. Jonsson, M. Leucker, H. Raffelt, and B. Steffen. On the correspondence between conformance testing and regular inference. In Fundamental Approaches to Software Engineering, 8th Int. Conf., FASE, volume 3442, pages 175-189. Springer, 2005.
F. Bergadano and S. Varricchio. Learning behaviors of automata from multiplicity and equivalence queries. SIAM J. Comput., 25(6):1268-1280, 1996.
B. Bollig, P. Habermehl, C. Kern, and M. Leucker. Angluin-style learning of NFA. In IJCAI, pages 1004-1009, 2009.
R. E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Trans. Computers, 35(8):677-691, 1986.
N. H. Bshouty, P. W. Goldberg, S. A. Goldman, and H. D. Mathias. Exact learning of discretized geometric concepts. SIAM J. Comput., 28(2):674-699, 1998.
K. Chubachi, Diptarama, R. Yoshinaka, and A. Shinohara. Query learning of regular languages over large ordered alphabets. In 1st Workshop on Learning and Automata (LearnAut), 2017.
K. Chubachi, D. Hendrian, R. Yoshinaka, and A. Shinohara. Query learning algorithm for residual symbolic finite automata. In Proc. Tenth Int. Symposium on Games, Automata, Logics, and Formal Verification, GandALF, pages 140-153, 2019.
E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. MIT Press, 2001.
L. D'Antoni and M. Veanes. Minimization of symbolic tree automata. In Proc. of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS, pages 873-882. ACM, 2016.
L. D'Antoni, M. Veanes, B. Livshits, and D. Molnar. Fast: a transducer-based language for tree manipulation. In ACM SIGPLAN Conf. on Programming Language Design and Implementation, PLDI, pages 384-394. ACM, 2014.
C. de la Higuera. Characteristic sets for polynomial grammatical inference. Machine Learning, 27(2):125-138, 1997.
S. Drews and L. D'Antoni. Learning symbolic automata. In Tools and Algorithms for the Construction and Analysis of Systems - 23rd Int. Conf., TACAS, pages 173-189, 2017.
D. Fisman. Inferring regular languages and ω-languages. J. Log. Algebraic Methods Program., 98:27-49, 2018.
D. Fisman, H. Frenkel, and S. Zilles. On the complexity of symbolic finite-state automata, 2021. URL: http://arxiv.org/abs/2011.05389.
E. M. Gold. Complexity of automaton identification from given data. Inf. Control., 37(3):302-320, 1978.
P. W. Goldberg, S. A. Goldman, and H. D. Mathias. Learning unions of boxes with membership and equivalence queries. In Proc. of the Seventh Annual ACM Conf. on Computational Learning Theory, COLT, pages 198-207. ACM, 1994.
S. A. Goldman and H. D. Mathias. Teaching a smarter learner. J. Comput. Syst. Sci., 52(2):255-267, 1996.
O. Grinchtein, B. Jonsson, and M. Leucker. Learning of event-recording automata. Theor. Comput. Sci., 411(47):4029-4054, 2010.
P. Hooimeijer, B. Livshits, D. Molnar, P. Saxena, and M. Veanes. Fast and precise sanitizer analysis with BEK. In 20th USENIX Security Symposium, San Francisco, CA, USA, August 8-12, 2011, Proc. Association, 2011.
F. Howar, B. Steffen, and M. Merten. Automata learning with automated alphabet abstraction refinement. In Verification, Model Checking, and Abstract Interpretation - 12th Int. Conf., VMCAI, volume 6538 of LNCS, pages 263-277. Springer, 2011.
Q. Hu and L. D'Antoni. Automatic program inversion using symbolic transducers. In Proc. of the 38th ACM SIGPLAN Conf. on Programming Language Design and Implementation, PLDI, pages 376-389. ACM, 2017.
M. Keil and P. Thiemann. Symbolic solving of extended regular expression inequalities. In 34th Int. Conf. on Foundation of Software Technology and Theoretical Computer Science, FSTTCS, pages 175-186, 2014.
O. Maler and I. Mens. Learning regular languages over large alphabets. In Tools and Algorithms for the Construction and Analysis of Systems - 20th Int. Conf., TACAS, volume 8413 of LNCS, pages 485-499. Springer, 2014.
O. Maler and I. Mens. A generic algorithm for learning symbolic automata from membership queries. In Models, Algorithms, Logics and Tools - Essays Dedicated to Kim Guldstrand Larsen on the Occasion of His 60th Birthday, pages 146-169, 2017.
O. Maler and A. Pnueli. On the learnability of infinitary regular sets. Inf. Comput., 118(2):316-326, 1995.
K. Mamouras, M. Raghothaman, R. Alur, Z. G. Ives, and S. Khanna. StreamQRE: modular specification and efficient evaluation of quantitative queries over streaming data. In Proc. of the 38th ACM SIGPLAN Conf. on Prog. Lang. Design and Impl., PLDI, pages 693-708, 2017.
A. Nakamura. Query learning of bounded-width obdds. Theor. Comput. Sci., 241(1-2):83-114, 2000.
D. Nitay, D. Fisman, and M. Ziv-Ukelson. Learning of structurally unambiguous probabilistic grammars. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, pages 9170-9178, 2021. URL: https://ojs.aaai.org/index.php/AAAI/article/view/17107.
J. Oncina and P. García. Inferring regular languages in polynomial update time. World Scientific, January 1992. URL: https://doi.org/10.1142/9789812797902_0004.
L. Pitt. Inductive inference, DFAs, and computational complexity. In Proc. Int. Workshop on Analogical and Inductive Inference, pages 18-44, 1989.
M. D. Preda, R. Giacobazzi, A. Lakhotia, and I. Mastroeni. Abstract symbolic automata: Mixed syntactic/semantic similarity analysis of executables. In Proc. of the 42nd Annual ACM SIGPLAN-SIGACT Symp. on Princ. of Prog. Lang., POPL, pages 329-341, 2015.
O. Saarikivi and M. Veanes. Translating c# to branching symbolic transducers. In IWIL@LPAR 2017 Workshop and LPAR-21 Short Presentations, volume 1 of Kalpa Publications in Computing. EasyChair, 2017.
Y. Sakakibara. Learning context-free grammars from structural data in polynomial time. Theor. Comput. Sci., 76(2-3):223-242, 1990.
S. Sheinvald. Learning deterministic variable automata over infinite alphabets. In Formal Methods - The Next 30 Years - Third World Congress, FM, volume 11800 of LNCS, pages 633-650. Springer, 2019.
Frits W. Vaandrager. Model learning. Commun. ACM, 60(2):86-95, 2017. URL: https://doi.org/10.1145/2967606.
M. Veanes, P. de Halleux, and N. Tillmann. Rex: Symbolic regular expression explorer. In Third Int. Comf, on Software Testing, Verification and Validation, ICST, pages 498-507. IEEE Computer Society, 2010.
X. Zhu, A. Singla, S. Zilles, and A. N. Rafferty. An overview of machine teaching. CoRR ArXiv, abs/1801.05927, 2018. URL: http://arxiv.org/abs/1801.05927.

Inferring Symbolic Automata

Authors Dana Fisman, Hadar Frenkel, Sandra Zilles

File

Document Identifiers

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

References

Thanks for your feedback!

Could not send message

Inferring Symbolic Automata

Authors Dana Fisman, Hadar Frenkel, Sandra Zilles

File

Document Identifiers

Related Versions

Subject Classification

ACM Subject Classification

Keywords

Metrics

Abstract

Cite As Get BibTex

Author Details

Funding

References

Thanks for your feedback!

Could not send message