eng
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Leibniz International Proceedings in Informatics
1868-8969
2023-06-21
12:1
12:22
10.4230/LIPIcs.CPM.2023.12
article
Compressed Indexing for Consecutive Occurrences
Gawrychowski, Paweł
1
Gourdel, Garance
2
Starikovskaya, Tatiana
3
Steiner, Teresa Anna
4
Institute of Computer Science, University of Wrocław, Poland
DI/ENS, PSL Research University, IRISA Inria Rennes, France
DI/ENS, PSL Research University, Paris, France
DTU Compute, Technical University of Denmark, Lyngby, Denmark
The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact occurrence of a given pattern P. However, practical applications motivate the necessity of considering more complex queries, for example concerning near occurrences of two patterns. Recently, Bille et al. [CPM 2021] introduced a variant of such queries, called gapped consecutive occurrences, in which a query consists of two patterns P₁ and P₂ and a range [a,b], and one must find all consecutive occurrences (q₁,q₂) of P₁ and P₂ such that q₂-q₁ ∈ [a,b]. By their results, we cannot hope for a very efficient indexing structure for such queries, even if a = 0 is fixed (although at the same time they provided a non-trivial upper bound). Motivated by this, we focus on a text given as a straight-line program (SLP) and design an index taking space polynomial in the size of the grammar that answers such queries in time optimal up to polylog factors.
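To make the query defined in the abstract concrete, here is a naive Python baseline that scans the uncompressed text. It is only a sketch of the query semantics, not the SLP-based index of the paper. It assumes 0-based starting positions, q₁ < q₂, and that "consecutive" means no occurrence of P₁ or P₂ starts strictly between q₁ and q₂. The function names are hypothetical.

```python
def occurrences(text, pattern):
    """Return all 0-based starting positions of pattern in text (overlaps included)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def gapped_consecutive_occurrences(text, p1, p2, a, b):
    """Naively report pairs (q1, q2) with q1 an occurrence of p1, q2 an
    occurrence of p2, no occurrence of either pattern strictly between
    them (assumed meaning of "consecutive"), and a <= q2 - q1 <= b."""
    if not p1 or not p2:
        raise ValueError("patterns must be non-empty")
    occ1 = set(occurrences(text, p1))
    occ2 = set(occurrences(text, p2))
    # Merge all occurrence positions; neighbours in this sorted list have
    # no occurrence of p1 or p2 starting strictly between them.
    events = sorted(occ1 | occ2)
    result = []
    for q1, q2 in zip(events, events[1:]):
        if q1 in occ1 and q2 in occ2 and a <= q2 - q1 <= b:
            result.append((q1, q2))
    return result
```

For example, on the text `"abcabxc"` with P₁ = `"ab"`, P₂ = `"c"` and range [0, 2], only the pair at distance 2 is reported. This baseline takes time linear in the text per query (beyond pattern matching), which is exactly what an index over the compressed representation aims to avoid.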
https://drops.dagstuhl.de/storage/00lipics/lipics-vol259-cpm2023/LIPIcs.CPM.2023.12/LIPIcs.CPM.2023.12.pdf
Compressed indexing
two patterns
consecutive occurrences