LIPIcs.CPM.2023.25.pdf
- Filesize: 0.76 MB
- 17 pages
In order to use them for compression, we extend L-systems (without ε-rules) with two parameters d and n, and also a coding τ, which determines unambiguously a string w = τ(φ^d(s))[1:n], where φ is the morphism of the system, and s is its axiom. The length of the shortest description of an L-system generating w is known as 𝓁, and it is arguably a relevant measure of repetitiveness that builds on the self-similarities that arise in the sequence. In this paper, we deepen the study of the measure 𝓁 and its relation with a better-established measure called δ, which builds on substring complexity. Our results show that 𝓁 and δ are largely orthogonal, in the sense that one can be much larger than the other, depending on the case. This suggests that both mechanisms capture different kinds of regularities related to repetitiveness. We then show that the recently introduced NU-systems, which combine the capabilities of L-systems with bidirectional macro schemes, can be asymptotically strictly smaller than both mechanisms for the same fixed string family, which makes the size ν of the smallest NU-system the unique smallest reachable repetitiveness measure to date. We conclude that in order to achieve better compression, we should combine morphism substitution with copy-paste mechanisms.
Feedback for Dagstuhl Publishing