,
Hendrik Maarand
,
Margus Veanes
,
Gabriel Ebner
Creative Commons Attribution 4.0 International license
Brzozowski proved that the set of derivatives of any regular expression is finite modulo associativity, idempotence and, notably, commutativity of the union operator. We extend this result to the case of symbolic location based derivatives, for which we prove finiteness of the state space by quotienting only by associativity, deduplication and idempotence (ADI); the fact that we don't use commutativity allows for this result to carry over to the derivative based backtracking (PCRE) match semantics, where the union operator is noncommutative. Furthermore, we consider regular expressions extended with lookarounds, intersection, and negation. We also show that our method for proving finiteness allows us to include certain simplification rules in the derivative operation while preserving finiteness. The finiteness proof is constructive: given an expression R, we construct a finite set that is an overapproximation (modulo ADI) of the set of derivatives of R. We reuse some of the infrastructure provided in previous formalization efforts for regular expressions in Lean 4, showing the flexibility and reusability of the framework.
@InProceedings{zhuchko_et_al:LIPIcs.ITP.2025.16,
author = {Zhuchko, Ekaterina and Maarand, Hendrik and Veanes, Margus and Ebner, Gabriel},
title = {{Finiteness of Symbolic Derivatives in Lean}},
booktitle = {16th International Conference on Interactive Theorem Proving (ITP 2025)},
pages = {16:1--16:19},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-396-6},
ISSN = {1868-8969},
year = {2025},
volume = {352},
editor = {Forster, Yannick and Keller, Chantal},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ITP.2025.16},
URN = {urn:nbn:de:0030-drops-246144},
doi = {10.4230/LIPIcs.ITP.2025.16},
annote = {Keywords: Lean, regular languages, lookarounds, derivatives, finiteness}
}
archived version