Inferring Symbolic Automata

Authors Dana Fisman, Hadar Frenkel, Sandra Zilles

Thumbnail PDF


  • Filesize: 1.48 MB
  • 19 pages

Document Identifiers

Author Details

Dana Fisman
  • Ben-Gurion University, Be’er Sheva, Israel
Hadar Frenkel
  • CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Sandra Zilles
  • University of Regina, Canada

Cite AsGet BibTex

Dana Fisman, Hadar Frenkel, and Sandra Zilles. Inferring Symbolic Automata. In 30th EACSL Annual Conference on Computer Science Logic (CSL 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 216, pp. 21:1-21:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


We study the learnability of symbolic finite state automata, a model shown useful in many applications in software verification. The state-of-the-art literature on this topic follows the query learning paradigm, and so far all obtained results are positive. We provide a necessary condition for efficient learnability of SFAs in this paradigm, from which we obtain the first negative result. The main focus of our work lies in the learnability of SFAs under the paradigm of identification in the limit using polynomial time and data. We provide a necessary condition and a sufficient condition for efficient learnability of SFAs in this paradigm, from which we derive a positive and a negative result.

Subject Classification

ACM Subject Classification
  • Theory of computation → Regular languages
  • Theory of computation → Formal languages and automata theory
  • Theory of computation → Models of computation
  • Computing methodologies → Machine learning
  • Symbolic Finite State Automata
  • Query Learning
  • Characteristic Sets


  • Access Statistics
  • Total Accesses (updated on a weekly basis)
    PDF Downloads


  1. F. Aarts and F. Vaandrager. Learning I/O automata. In Concurrency Theory, 21th Int. Conf., CONCUR 2010, pages 71-85, 2010. Google Scholar
  2. R. Alur, L. Fix, and T. A. Henzinger. Event-clock automata: A determinizable class of timed automata. Theor. Comput. Sci., 211(1-2):253-273, 1999. Google Scholar
  3. D. Angluin. A note on the number of queries needed to identify regular languages. Inf. Control., 51(1):76-87, 1981. Google Scholar
  4. D. Angluin. Learning regular sets from queries and counterexamples. Inf. Comput., 75(2):87-106, 1987. Google Scholar
  5. D. Angluin. Queries and concept learning. Machine Learning, 2(4):319-342, 1988. Google Scholar
  6. D. Angluin. Negative results for equivalence queries. Mach. Learn., 5:121-150, 1990. Google Scholar
  7. D. Angluin, S. Eisenstat, and D. Fisman. Learning regular languages via alternating automata. In Proc. of the Twenty-Fourth Int. Joint Conf. on Artificial Intelligence, IJCAI, pages 3308-3314, 2015. Google Scholar
  8. D. Angluin and D. Fisman. Learning regular omega languages. Theor. Comput. Sci., 650:57-72, 2016. Google Scholar
  9. G. Argyros and L. D'Antoni. The learnability of symbolic automata. In Computer Aided Verification - 30th Int. Conf., CAV, volume 10981 of LNCS, pages 427-445. Springer, 2018. Google Scholar
  10. G. Argyros, I. Stais, S. Jana, A. D. Keromytis, and A. Kiayias. Sfadiff: Automated evasion attacks and fingerprinting using black-box differential automata learning. In Proc. of the 2016 ACM SIGSAC Conf. on Computer and Communications Security, pages 1690-1701. ACM, 2016. Google Scholar
  11. G. Argyros, I. Stais, A. Kiayias, and A. D. Keromytis. Back in black: Towards formal, black box analysis of sanitizers and filters. In IEEE Symposium on Security and Privacy, SP, pages 91-109, 2016. Google Scholar
  12. A. Beimel and E. Kushilevitz. Learning boxes in high dimension. Algorithmica, 22(1/2):76-90, 1998. Google Scholar
  13. A. Beimel and E. Kushilevitz. Learning unions of high-dimensional boxes over the reals. Inf. Process. Lett., 73(5-6):213-220, 2000. Google Scholar
  14. T. Berg, O. Grinchtein, B. Jonsson, M. Leucker, H. Raffelt, and B. Steffen. On the correspondence between conformance testing and regular inference. In Fundamental Approaches to Software Engineering, 8th Int. Conf., FASE, volume 3442, pages 175-189. Springer, 2005. Google Scholar
  15. F. Bergadano and S. Varricchio. Learning behaviors of automata from multiplicity and equivalence queries. SIAM J. Comput., 25(6):1268-1280, 1996. Google Scholar
  16. B. Bollig, P. Habermehl, C. Kern, and M. Leucker. Angluin-style learning of NFA. In IJCAI, pages 1004-1009, 2009. Google Scholar
  17. R. E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Trans. Computers, 35(8):677-691, 1986. Google Scholar
  18. N. H. Bshouty, P. W. Goldberg, S. A. Goldman, and H. D. Mathias. Exact learning of discretized geometric concepts. SIAM J. Comput., 28(2):674-699, 1998. Google Scholar
  19. K. Chubachi, Diptarama, R. Yoshinaka, and A. Shinohara. Query learning of regular languages over large ordered alphabets. In 1st Workshop on Learning and Automata (LearnAut), 2017. Google Scholar
  20. K. Chubachi, D. Hendrian, R. Yoshinaka, and A. Shinohara. Query learning algorithm for residual symbolic finite automata. In Proc. Tenth Int. Symposium on Games, Automata, Logics, and Formal Verification, GandALF, pages 140-153, 2019. Google Scholar
  21. E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. MIT Press, 2001. Google Scholar
  22. L. D'Antoni and M. Veanes. Minimization of symbolic tree automata. In Proc. of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS, pages 873-882. ACM, 2016. Google Scholar
  23. L. D'Antoni, M. Veanes, B. Livshits, and D. Molnar. Fast: a transducer-based language for tree manipulation. In ACM SIGPLAN Conf. on Programming Language Design and Implementation, PLDI, pages 384-394. ACM, 2014. Google Scholar
  24. C. de la Higuera. Characteristic sets for polynomial grammatical inference. Machine Learning, 27(2):125-138, 1997. Google Scholar
  25. S. Drews and L. D'Antoni. Learning symbolic automata. In Tools and Algorithms for the Construction and Analysis of Systems - 23rd Int. Conf., TACAS, pages 173-189, 2017. Google Scholar
  26. D. Fisman. Inferring regular languages and ω-languages. J. Log. Algebraic Methods Program., 98:27-49, 2018. Google Scholar
  27. D. Fisman, H. Frenkel, and S. Zilles. On the complexity of symbolic finite-state automata, 2021. URL:
  28. E. M. Gold. Complexity of automaton identification from given data. Inf. Control., 37(3):302-320, 1978. Google Scholar
  29. P. W. Goldberg, S. A. Goldman, and H. D. Mathias. Learning unions of boxes with membership and equivalence queries. In Proc. of the Seventh Annual ACM Conf. on Computational Learning Theory, COLT, pages 198-207. ACM, 1994. Google Scholar
  30. S. A. Goldman and H. D. Mathias. Teaching a smarter learner. J. Comput. Syst. Sci., 52(2):255-267, 1996. Google Scholar
  31. O. Grinchtein, B. Jonsson, and M. Leucker. Learning of event-recording automata. Theor. Comput. Sci., 411(47):4029-4054, 2010. Google Scholar
  32. P. Hooimeijer, B. Livshits, D. Molnar, P. Saxena, and M. Veanes. Fast and precise sanitizer analysis with BEK. In 20th USENIX Security Symposium, San Francisco, CA, USA, August 8-12, 2011, Proc. Association, 2011. Google Scholar
  33. F. Howar, B. Steffen, and M. Merten. Automata learning with automated alphabet abstraction refinement. In Verification, Model Checking, and Abstract Interpretation - 12th Int. Conf., VMCAI, volume 6538 of LNCS, pages 263-277. Springer, 2011. Google Scholar
  34. Q. Hu and L. D'Antoni. Automatic program inversion using symbolic transducers. In Proc. of the 38th ACM SIGPLAN Conf. on Programming Language Design and Implementation, PLDI, pages 376-389. ACM, 2017. Google Scholar
  35. M. Keil and P. Thiemann. Symbolic solving of extended regular expression inequalities. In 34th Int. Conf. on Foundation of Software Technology and Theoretical Computer Science, FSTTCS, pages 175-186, 2014. Google Scholar
  36. O. Maler and I. Mens. Learning regular languages over large alphabets. In Tools and Algorithms for the Construction and Analysis of Systems - 20th Int. Conf., TACAS, volume 8413 of LNCS, pages 485-499. Springer, 2014. Google Scholar
  37. O. Maler and I. Mens. A generic algorithm for learning symbolic automata from membership queries. In Models, Algorithms, Logics and Tools - Essays Dedicated to Kim Guldstrand Larsen on the Occasion of His 60th Birthday, pages 146-169, 2017. Google Scholar
  38. O. Maler and A. Pnueli. On the learnability of infinitary regular sets. Inf. Comput., 118(2):316-326, 1995. Google Scholar
  39. K. Mamouras, M. Raghothaman, R. Alur, Z. G. Ives, and S. Khanna. StreamQRE: modular specification and efficient evaluation of quantitative queries over streaming data. In Proc. of the 38th ACM SIGPLAN Conf. on Prog. Lang. Design and Impl., PLDI, pages 693-708, 2017. Google Scholar
  40. A. Nakamura. Query learning of bounded-width obdds. Theor. Comput. Sci., 241(1-2):83-114, 2000. Google Scholar
  41. D. Nitay, D. Fisman, and M. Ziv-Ukelson. Learning of structurally unambiguous probabilistic grammars. In Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021, pages 9170-9178, 2021. URL:
  42. J. Oncina and P. García. Inferring regular languages in polynomial update time. World Scientific, January 1992. URL:
  43. L. Pitt. Inductive inference, DFAs, and computational complexity. In Proc. Int. Workshop on Analogical and Inductive Inference, pages 18-44, 1989. Google Scholar
  44. M. D. Preda, R. Giacobazzi, A. Lakhotia, and I. Mastroeni. Abstract symbolic automata: Mixed syntactic/semantic similarity analysis of executables. In Proc. of the 42nd Annual ACM SIGPLAN-SIGACT Symp. on Princ. of Prog. Lang., POPL, pages 329-341, 2015. Google Scholar
  45. O. Saarikivi and M. Veanes. Translating c# to branching symbolic transducers. In IWIL@LPAR 2017 Workshop and LPAR-21 Short Presentations, volume 1 of Kalpa Publications in Computing. EasyChair, 2017. Google Scholar
  46. Y. Sakakibara. Learning context-free grammars from structural data in polynomial time. Theor. Comput. Sci., 76(2-3):223-242, 1990. Google Scholar
  47. S. Sheinvald. Learning deterministic variable automata over infinite alphabets. In Formal Methods - The Next 30 Years - Third World Congress, FM, volume 11800 of LNCS, pages 633-650. Springer, 2019. Google Scholar
  48. Frits W. Vaandrager. Model learning. Commun. ACM, 60(2):86-95, 2017. URL:
  49. M. Veanes, P. de Halleux, and N. Tillmann. Rex: Symbolic regular expression explorer. In Third Int. Comf, on Software Testing, Verification and Validation, ICST, pages 498-507. IEEE Computer Society, 2010. Google Scholar
  50. X. Zhu, A. Singla, S. Zilles, and A. N. Rafferty. An overview of machine teaching. CoRR ArXiv, abs/1801.05927, 2018. URL: