LIPIcs.ESA.2020.32.pdf
- Filesize: 0.6 MB
- 18 pages
The notions of periodicity and repetitions in strings, and hence these of runs and squares, naturally extend to two-dimensional strings. We consider two types of repetitions in 2D-strings: 2D-runs and quartics (quartics are a 2D-version of squares in standard strings). Amir et al. introduced 2D-runs, showed that there are 𝒪(n³) of them in an n × n 2D-string and presented a simple construction giving a lower bound of Ω(n²) for their number (Theoretical Computer Science, 2020). We make a significant step towards closing the gap between these bounds by showing that the number of 2D-runs in an n × n 2D-string is 𝒪(n² log² n). In particular, our bound implies that the 𝒪(n²log n + output) run-time of the algorithm of Amir et al. for computing 2D-runs is also 𝒪(n² log² n). We expect this result to allow for exploiting 2D-runs algorithmically in the area of 2D pattern matching. A quartic is a 2D-string composed of 2 × 2 identical blocks (2D-strings) that was introduced by Apostolico and Brimkov (Theoretical Computer Science, 2000), where by quartics they meant only primitively rooted quartics, i.e. built of a primitive block. Here our notion of quartics is more general and analogous to that of squares in 1D-strings. Apostolico and Brimkov showed that there are 𝒪(n² log² n) occurrences of primitively rooted quartics in an n × n 2D-string and that this bound is attainable. Consequently the number of distinct primitively rooted quartics is 𝒪(n² log² n). The straightforward bound for the maximal number of distinct general quartics is 𝒪(n⁴). Here, we prove that the number of distinct general quartics is also 𝒪(n² log² n). This extends the rich combinatorial study of the number of distinct squares in a 1D-string, that was initiated by Fraenkel and Simpson (Journal of Combinatorial Theory, Series A, 1998), to two dimensions. Finally, we show some algorithmic applications of 2D-runs. Specifically, we present algorithms for computing all occurrences of primitively rooted quartics and counting all general distinct quartics in 𝒪(n² log² n) time, which is quasi-linear with respect to the size of the input. The former algorithm is optimal due to the lower bound of Apostolico and Brimkov. The latter can be seen as a continuation of works on enumeration of distinct squares in 1D-strings using runs (Crochemore et al., Theoretical Computer Science, 2014). However, the methods used in 2D are different because of different properties of 2D-runs and quartics.
Feedback for Dagstuhl Publishing