Documents authored by Terauchi, Tachio


Document
On the Expressive Power of Regular Expressions with Backreferences

Authors: Taisei Nogami and Tachio Terauchi

Published in: LIPIcs, Volume 272, 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)


Abstract
A rewb is a regular expression extended with a feature called backreference. Backreferences are widely known as a practical extension of regular expressions and are supported by most modern regular expression engines, such as those in the standard libraries of Java and Python, among others. Meanwhile, indexed languages are the languages generated by indexed grammars, a formal grammar class proposed by A. V. Aho. We show that the expressive powers of these two models are related in the following way: every language described by a rewb is an indexed language. As the smallest formal grammar class previously known to contain rewbs is the class of context-sensitive languages, our result strictly improves the known upper bound. Moreover, we prove the following two claims: there exists a rewb whose language does not belong to the class of stack languages, which is a proper subclass of indexed languages, and the language described by a rewb without a captured reference is in the class of nonerasing stack languages, which is a proper subclass of stack languages. Finally, we show that the hierarchy investigated in a prior study, which separates the expressive power of rewbs by the notion of nested levels, is within the class of nonerasing stack languages.
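
As an illustration of what a backreference does (not an example taken from the paper), the following minimal sketch uses Python's standard `re` module to recognize the copy language { ww | w ∈ {a,b}* }, a textbook example of a language that is not regular. The names `SQUARE` and `is_square` are hypothetical and chosen only for this sketch.

```python
import re

# Hypothetical pattern for this sketch: group 1 captures some w in {a,b}*,
# and the backreference \1 then requires an exact repetition of that w.
SQUARE = re.compile(r"([ab]*)\1")


def is_square(s: str) -> bool:
    """Return True if s has the form ww for some w over {a, b}."""
    return SQUARE.fullmatch(s) is not None


if __name__ == "__main__":
    for word in ["", "abab", "aabaab", "aba", "abba"]:
        print(repr(word), is_square(word))
```

The engine tries every possible length for the captured group by backtracking, so the check succeeds exactly when the input splits into two identical halves.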

Cite as

Taisei Nogami and Tachio Terauchi. On the Expressive Power of Regular Expressions with Backreferences. In 48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 272, pp. 71:1-71:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


@InProceedings{nogami_et_al:LIPIcs.MFCS.2023.71,
  author =	{Nogami, Taisei and Terauchi, Tachio},
  title =	{{On the Expressive Power of Regular Expressions with Backreferences}},
  booktitle =	{48th International Symposium on Mathematical Foundations of Computer Science (MFCS 2023)},
  pages =	{71:1--71:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-292-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{272},
  editor =	{Leroux, J\'{e}r\^{o}me and Lombardy, Sylvain and Peleg, David},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.MFCS.2023.71},
  URN =		{urn:nbn:de:0030-drops-186055},
  doi =		{10.4230/LIPIcs.MFCS.2023.71},
  annote =	{Keywords: Regular expressions, Backreferences, Expressive power}
}
Document
On Lookaheads in Regular Expressions with Backreferences

Authors: Nariyoshi Chida and Tachio Terauchi

Published in: LIPIcs, Volume 228, 7th International Conference on Formal Structures for Computation and Deduction (FSCD 2022)


Abstract
Many modern regular expression engines employ various extensions to give more expressive support for real-world usages. Among the major extensions employed by many of the modern regular expression engines are backreferences and lookaheads. A question of interest about these extended regular expressions is their expressive power. Previous works have shown that (i) the extension by lookaheads does not enhance the expressive power, i.e., the expressive power of regular expressions with lookaheads is still regular, and that (ii) the extension by backreferences enhances the expressive power, i.e., the expressive power of regular expressions with backreferences (abbreviated as rewb) is no longer regular. This raises the following natural question: Does the extension of regular expressions with backreferences by lookaheads enhance the expressive power of regular expressions with backreferences? This paper answers the question positively by proving that adding either positive lookaheads or negative lookaheads increases the expressive power of rewb (the former abbreviated as rewbl_p and the latter as rewbl_n). A consequence of our result is that neither the class of finite state automata nor that of memory automata (MFA) of Schmid [Markus L. Schmid, 2016] (which corresponds to regular expressions with backreferences but without lookaheads) corresponds to rewbl_p or rewbl_n. To fill the void, as a first step toward building such automata, we propose a new class of automata called memory automata with positive lookaheads (PLMFA) that corresponds to rewbl_p. The key idea of PLMFA is to extend MFA with a new kind of memory, called positive-lookahead memory, that is used to simulate the backtracking behavior of positive lookaheads. Interestingly, our positive-lookahead memories are almost perfectly symmetric to the capturing-group memories of MFA. Therefore, our PLMFA can be seen as a natural extension of MFA that can be obtained independently of its original intended purpose of simulating rewbl_p.
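
To make the combination of the two features concrete, here is a minimal sketch in Python's standard `re` module of a pattern that uses a positive lookahead together with a backreference. It only illustrates the syntax and the non-consuming behavior of lookaheads; it is not an example from the paper and is not claimed to witness the separation result. The names `PATTERN` and `matches` are hypothetical.

```python
import re

# Hypothetical pattern for this sketch.
# (?=([ab]*)c\1$) : positive lookahead; checks, without consuming input,
#                   that the whole string has the form w c w.
# a*b*ca*b*       : then consumes the string, requiring each half to lie in a*b*.
PATTERN = re.compile(r"(?=([ab]*)c\1$)a*b*ca*b*")


def matches(s: str) -> bool:
    """Return True if s = w c w with w in a*b*."""
    return PATTERN.fullmatch(s) is not None


if __name__ == "__main__":
    for word in ["c", "abcab", "bacba", "abcba"]:
        print(repr(word), matches(word))
```

Because the lookahead consumes nothing, both constraints are checked against the same input starting at position 0, so the pattern accepts only strings that satisfy the two conditions simultaneously.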

Cite as

Nariyoshi Chida and Tachio Terauchi. On Lookaheads in Regular Expressions with Backreferences. In 7th International Conference on Formal Structures for Computation and Deduction (FSCD 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 228, pp. 15:1-15:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


@InProceedings{chida_et_al:LIPIcs.FSCD.2022.15,
  author =	{Chida, Nariyoshi and Terauchi, Tachio},
  title =	{{On Lookaheads in Regular Expressions with Backreferences}},
  booktitle =	{7th International Conference on Formal Structures for Computation and Deduction (FSCD 2022)},
  pages =	{15:1--15:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-233-4},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{228},
  editor =	{Felty, Amy P.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSCD.2022.15},
  URN =		{urn:nbn:de:0030-drops-162965},
  doi =		{10.4230/LIPIcs.FSCD.2022.15},
  annote =	{Keywords: Regular expressions, Lookaheads, Backreferences, Memory automata}
}