55 Search Results for "Rytter, Wojciech"


Volume

LIPIcs, Volume 78

28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)

CPM 2017, July 4-6, 2017, Warsaw, Poland

Editors: Juha Kärkkäinen, Jakub Radoszewski, and Wojciech Rytter

Document
Efficient Exact Online String Matching Through Linked Weak Factors

Authors: Matthew N. Palmer, Simone Faro, and Stefano Scafiti

Published in: LIPIcs, Volume 301, 22nd International Symposium on Experimental Algorithms (SEA 2024)


Abstract
Online exact string matching is a fundamental computational problem in computer science, involving the sequential search for a pattern within a large text without prior access to the entire text. Its significance is underscored by its diverse applications in data compression, data mining, text editing, and bioinformatics, just to cite a few, where efficient substring matching is crucial. While the problem has been a subject of study for years, recent decades have witnessed a heightened focus on experimental solutions, employing various techniques to achieve superior performance. Notably, approaches centered around weak factor recognition have emerged as leaders in experimental settings, gaining increasing attention. This paper introduces Hash Chain, a novel algorithm founded on a robust weak factor recognition approach that links adjacent factors through hashing. Building upon the efficacy of weak recognition techniques, the proposed algorithm incorporates innovative strategies for organizing data structures and optimizations to enhance performance. Despite its quadratic worst-case time complexity, the new proposed algorithm demonstrates sublinear behavior in practice, outperforming currently known algorithms in the literature.

Cite as

Matthew N. Palmer, Simone Faro, and Stefano Scafiti. Efficient Exact Online String Matching Through Linked Weak Factors. In 22nd International Symposium on Experimental Algorithms (SEA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 301, pp. 24:1-24:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{palmer_et_al:LIPIcs.SEA.2024.24,
  author =	{Palmer, Matthew N. and Faro, Simone and Scafiti, Stefano},
  title =	{{Efficient Exact Online String Matching Through Linked Weak Factors}},
  booktitle =	{22nd International Symposium on Experimental Algorithms (SEA 2024)},
  pages =	{24:1--24:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-325-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{301},
  editor =	{Liberti, Leo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.24},
  URN =		{urn:nbn:de:0030-drops-203896},
  doi =		{10.4230/LIPIcs.SEA.2024.24},
  annote =	{Keywords: String matching, text processing, weak recognition, hashing, experimental algorithms, design and analysis of algorithms}
}
Document
Track A: Algorithms, Complexity and Games
Optimal Bounds for Distinct Quartics

Authors: Panagiotis Charalampopoulos, Paweł Gawrychowski, and Samah Ghazawi

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)


Abstract
A fundamental concept related to strings is that of repetitions. It has been extensively studied in many versions, from both purely combinatorial and algorithmic angles. One of the most basic questions is how many distinct squares, i.e., distinct strings of the form UU, a string of length n can contain as fragments. It turns out that this is always 𝒪(n), and the bound cannot be improved to sublinear in n [Fraenkel and Simpson, JCTA 1998]. Several similar questions about repetitions in strings have been considered, and by now we seem to have a good understanding of their repetitive structure. For higher-dimensional strings, the basic concept of periodicity has been successfully extended and applied to design efficient algorithms - it is inherently more complex than for regular strings. Extending the notion of repetitions and understanding the repetitive structure of higher-dimensional strings is however far from complete. Quartics were introduced by Apostolico and Brimkov [TCS 2000] as analogues of squares in two dimensions. Charalampopoulos, Radoszewski, Rytter, Waleń, and Zuba [ESA 2020] proved that the number of distinct quartics in an n×n 2D string is 𝒪(n²log²n) and that they can be computed in 𝒪(n²log²n) time. Gawrychowski, Ghazawi, and Landau [SPIRE 2021] constructed an infinite family of n×n 2D strings with Ω(n²log n) distinct quartics. This brings the challenge of determining asymptotically tight bounds. Here, we settle both the combinatorial and the algorithmic aspects of this question: the number of distinct quartics in an n×n 2D string is 𝒪(n²log n) and they can be computed in the worst-case optimal 𝒪(n²log n) time. As expected, our solution heavily exploits the periodic structure implied by occurrences of quartics. However, the two-dimensional nature of the problem introduces some technical challenges. Somewhat surprisingly, we overcome the final challenge for the combinatorial bound using a result of Marcus and Tardos [JCTA 2004] for permutation avoidance on matrices.

Cite as

Panagiotis Charalampopoulos, Paweł Gawrychowski, and Samah Ghazawi. Optimal Bounds for Distinct Quartics. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 39:1-39:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.ICALP.2024.39,
  author =	{Charalampopoulos, Panagiotis and Gawrychowski, Pawe{\l} and Ghazawi, Samah},
  title =	{{Optimal Bounds for Distinct Quartics}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{39:1--39:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.39},
  URN =		{urn:nbn:de:0030-drops-201823},
  doi =		{10.4230/LIPIcs.ICALP.2024.39},
  annote =	{Keywords: 2D strings, quartics, repetitions, periodicity}
}
Document
Track A: Algorithms, Complexity and Games
Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching

Authors: Kento Iseri, Tomohiro I, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, and Ayumi Shinohara

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)


Abstract
A parameterized string (p-string) is a string over an alphabet (Σ_s ∪ Σ_p), where Σ_s and Σ_p are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively. Two p-strings x and y are said to parameterized match (p-match) if and only if x can be transformed into y by applying a bijection on Σ_p to every occurrence of p-symbols in x. The indexing problem for p-matching is to preprocess a p-string T of length n so that we can efficiently find the occurrences of substrings of T that p-match with a given pattern. Let σ_s and respectively σ_p be the numbers of distinct s-symbols and p-symbols that appear in T and σ = σ_s + σ_p. Extending the Burrows-Wheeler Transform (BWT) based index for exact string pattern matching, Ganguly et al. [SODA 2017] proposed parameterized BWTs (pBWTs) to design the first compact index for p-matching, and posed an open problem on how to construct the pBWT-based index in compact space, i.e., in O(n lg |Σ_s ∪ Σ_p|) bits of space. Hashimoto et al. [SPIRE 2022] showed how to construct the pBWT for T, under the assumption that Σ_s ∪ Σ_p = [0..O(σ)], in O(n lg σ) bits of space and O(n (σ_p lg n)/(lg lg n)) time in an online manner while reading the symbols of T from right to left. In this paper, we refine Hashimoto et al.’s algorithm to work in O(n lg σ) bits of space and O(n (lg σ_p lg n)/(lg lg n)) time in a more general assumption that Σ_s ∪ Σ_p = [0..n^{O(1)}]. Our result has an immediate application to constructing parameterized suffix arrays in O(n (lg σ_p lg n)/(lg lg n)) time and O(n lg σ) bits of working space. We also show that our data structure can support backward search, a core procedure of BWT-based indexes, at any stage of the online construction, making it the first compact index for p-matching that can be constructed in compact space and even in an online manner.

Cite as

Kento Iseri, Tomohiro I, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, and Ayumi Shinohara. Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 89:1-89:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{iseri_et_al:LIPIcs.ICALP.2024.89,
  author =	{Iseri, Kento and I, Tomohiro and Hendrian, Diptarama and K\"{o}ppl, Dominik and Yoshinaka, Ryo and Shinohara, Ayumi},
  title =	{{Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{89:1--89:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.89},
  URN =		{urn:nbn:de:0030-drops-202324},
  doi =		{10.4230/LIPIcs.ICALP.2024.89},
  annote =	{Keywords: Index for parameterized pattern matching, Parameterized Burrows-Wheeler Transform, Online construction}
}
Document
Track B: Automata, Logic, Semantics, and Theory of Programming
The Structure of Trees in the Pushdown Hierarchy

Authors: Arnaud Carayol and Lucien Charamond

Published in: LIPIcs, Volume 297, 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)


Abstract
In this article, we investigate the structure of the trees in the pushdown hierarchy, a hierarchy of infinite graphs having a decidable MSO-theory. We show that a binary complete tree in the pushdown hierarchy must contain at least two different subtrees which are isomorphic. We extend this property to any tree with no leaves and with chains of unary vertices of bounded length. We provided two applications of this result. A first application in formal language theory, gives a simple argument to show that some languages are not deterministic higher-order indexed languages. A second application in number theory shows that the real numbers defined by deterministic higher-order pushdown automata are either rational or transcendental.

Cite as

Arnaud Carayol and Lucien Charamond. The Structure of Trees in the Pushdown Hierarchy. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 297, pp. 131:1-131:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{carayol_et_al:LIPIcs.ICALP.2024.131,
  author =	{Carayol, Arnaud and Charamond, Lucien},
  title =	{{The Structure of Trees in the Pushdown Hierarchy}},
  booktitle =	{51st International Colloquium on Automata, Languages, and Programming (ICALP 2024)},
  pages =	{131:1--131:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-322-5},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{297},
  editor =	{Bringmann, Karl and Grohe, Martin and Puppis, Gabriele and Svensson, Ola},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2024.131},
  URN =		{urn:nbn:de:0030-drops-202749},
  doi =		{10.4230/LIPIcs.ICALP.2024.131},
  annote =	{Keywords: Pushdown hierarchy, Monadic second-order logic, Automatic numbers}
}
Document
Approximate Circular Pattern Matching Under Edit Distance

Authors: Panagiotis Charalampopoulos, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 289, 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024)


Abstract
In the k-Edit Circular Pattern Matching (k-Edit CPM) problem, we are given a length-n text T, a length-m pattern P, and a positive integer threshold k, and we are to report all starting positions of the substrings of T that are at edit distance at most k from some cyclic rotation of P. In the decision version of the problem, we are to check if any such substring exists. Very recently, Charalampopoulos et al. [ESA 2022] presented 𝒪(nk²)-time and 𝒪(nk log³ k)-time solutions for the reporting and decision versions of k-Edit CPM, respectively. Here, we show that the reporting and decision versions of k-Edit CPM can be solved in 𝒪(n+(n/m) k⁶) time and 𝒪(n+(n/m) k⁵ log³ k) time, respectively, thus obtaining the first algorithms with a complexity of the type 𝒪(n+(n/m) poly(k)) for this problem. Notably, our algorithms run in 𝒪(n) time when m = Ω(k⁶) and are superior to the previous respective solutions when m = ω(k⁴). We provide a meta-algorithm that yields efficient algorithms in several other interesting settings, such as when the strings are given in a compressed form (as straight-line programs), when the strings are dynamic, or when we have a quantum computer. We obtain our solutions by exploiting the structure of approximate circular occurrences of P in T, when T is relatively short w.r.t. P. Roughly speaking, either the starting positions of approximate occurrences of rotations of P form 𝒪(k⁴) intervals that can be computed efficiently, or some rotation of P is almost periodic (is at a small edit distance from a string with small period). Dealing with the almost periodic case is the most technically demanding part of this work; we tackle it using properties of locked fragments (originating from [Cole and Hariharan, SICOMP 2002]).

Cite as

Panagiotis Charalampopoulos, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Approximate Circular Pattern Matching Under Edit Distance. In 41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 289, pp. 24:1-24:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.STACS.2024.24,
  author =	{Charalampopoulos, Panagiotis and Pissis, Solon P. and Radoszewski, Jakub and Rytter, Wojciech and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Approximate Circular Pattern Matching Under Edit Distance}},
  booktitle =	{41st International Symposium on Theoretical Aspects of Computer Science (STACS 2024)},
  pages =	{24:1--24:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-311-9},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{289},
  editor =	{Beyersdorff, Olaf and Kant\'{e}, Mamadou Moustapha and Kupferman, Orna and Lokshtanov, Daniel},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.STACS.2024.24},
  URN =		{urn:nbn:de:0030-drops-197346},
  doi =		{10.4230/LIPIcs.STACS.2024.24},
  annote =	{Keywords: circular pattern matching, approximate pattern matching, edit distance}
}
Document
Linear-Time Computation of Cyclic Roots and Cyclic Covers of a String

Authors: Costas S. Iliopoulos, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)


Abstract
Cyclic versions of covers and roots of a string are considered in this paper. A prefix V of a string S is a cyclic root of S if S is a concatenation of cyclic rotations of V. A prefix V of S is a cyclic cover of S if the occurrences of the cyclic rotations of V cover all positions of S. We present 𝒪(n)-time algorithms computing all cyclic roots (using number-theoretic tools) and all cyclic covers (using tools related to seeds) of a length-n string over an integer alphabet. Our results improve upon 𝒪(n log log n) and 𝒪(n log n) time complexities of recent algorithms of Grossi et al. (WALCOM 2023) for the respective problems and provide novel approaches to the problems. As a by-product, we obtain an optimal data structure for Internal Circular Pattern Matching queries that generalize Internal Pattern Matching and Cyclic Equivalence queries of Kociumaka et al. (SODA 2015).

Cite as

Costas S. Iliopoulos, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Linear-Time Computation of Cyclic Roots and Cyclic Covers of a String. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 15:1-15:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{iliopoulos_et_al:LIPIcs.CPM.2023.15,
  author =	{Iliopoulos, Costas S. and Kociumaka, Tomasz and Radoszewski, Jakub and Rytter, Wojciech and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Linear-Time Computation of Cyclic Roots and Cyclic Covers of a String}},
  booktitle =	{34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
  pages =	{15:1--15:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-276-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{259},
  editor =	{Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.15},
  URN =		{urn:nbn:de:0030-drops-179697},
  doi =		{10.4230/LIPIcs.CPM.2023.15},
  annote =	{Keywords: cyclic cover, cyclic root, circular pattern matching, internal pattern matching}
}
Document
Approximate Circular Pattern Matching

Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Jakub Radoszewski, Solon P. Pissis, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 244, 30th Annual European Symposium on Algorithms (ESA 2022)


Abstract
We investigate the complexity of approximate circular pattern matching (CPM, in short) under the Hamming and edit distance. Under each of these two basic metrics, we are given a length-n text T, a length-m pattern P, and a positive integer threshold k, and we are to report all starting positions (called occurrences) of fragments of T that are at distance at most k from some cyclic rotation of P. In the decision version of the problem, we are to check if there is any such occurrence. All previous results for approximate CPM were either average-case upper bounds or heuristics, with the exception of the work of Charalampopoulos et al. [CKP^+, JCSS'21], who considered only the Hamming distance. For the reporting version of the approximate CPM problem, under the Hamming distance we improve upon the main algorithm of [CKP^+, JCSS'21] from 𝒪(n+(n/m) ⋅ k⁴) to 𝒪(n+(n/m) ⋅ k³ log log k) time; for the edit distance, we give an 𝒪(nk²)-time algorithm. Notably, for the decision versions and wide parameter-ranges, we give algorithms whose complexities are almost identical to the state-of-the-art for standard (i.e., non-circular) approximate pattern matching: - For the decision version of the approximate CPM problem under the Hamming distance, we obtain an 𝒪(n+(n/m) ⋅ k² log k / log log k)-time algorithm, which works in 𝒪(n) time whenever k = 𝒪(√{m log log m / log m}). In comparison, the fastest algorithm for the standard counterpart of the problem, by Chan et al. [CGKKP, STOC’20], runs in 𝒪(n) time only for k = 𝒪(√m). We achieve this result via a reduction to a geometric problem by building on ideas from [CKP^+, JCSS'21] and Charalampopoulos et al. [CKW, FOCS'20]. - For the decision version of the approximate CPM problem under the edit distance, the 𝒪(nklog³ k) runtime of our algorithm near matches the 𝒪(nk) runtime of the Landau-Vishkin algorithm [LV, J. Algorithms'89] for approximate pattern matching under edit distance; the latter algorithm remains the fastest known for k = Ω(m^{2/5}). As a stepping stone, we propose an 𝒪(nklog³ k)-time algorithm for solving the Longest Prefix k'-Approximate Match problem, proposed by Landau et al. [LMS, SICOMP'98], for all k' ∈ {1,…,k}. Our algorithm is based on Tiskin’s theory of seaweeds [Tiskin, Math. Comput. Sci.'08], with recent advancements (see Charalampopoulos et al. [CKW, FOCS'22]), and on exploiting the seaweeds' relation to Monge matrices. In contrast, we obtain a conditional lower bound that suggests a polynomial separation between approximate CPM under the Hamming distance over the binary alphabet and its non-circular counterpart. We also show that a strongly subquadratic-time algorithm for the decision version of approximate CPM under edit distance would refute the Strong Exponential Time Hypothesis.

Cite as

Panagiotis Charalampopoulos, Tomasz Kociumaka, Jakub Radoszewski, Solon P. Pissis, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Approximate Circular Pattern Matching. In 30th Annual European Symposium on Algorithms (ESA 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 244, pp. 35:1-35:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.ESA.2022.35,
  author =	{Charalampopoulos, Panagiotis and Kociumaka, Tomasz and Radoszewski, Jakub and Pissis, Solon P. and Rytter, Wojciech and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Approximate Circular Pattern Matching}},
  booktitle =	{30th Annual European Symposium on Algorithms (ESA 2022)},
  pages =	{35:1--35:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-247-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{244},
  editor =	{Chechik, Shiri and Navarro, Gonzalo and Rotenberg, Eva and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2022.35},
  URN =		{urn:nbn:de:0030-drops-169738},
  doi =		{10.4230/LIPIcs.ESA.2022.35},
  annote =	{Keywords: approximate circular pattern matching, Hamming distance, edit distance}
}
Document
Linear-Time Computation of Shortest Covers of All Rotations of a String

Authors: Maxime Crochemore, Costas S. Iliopoulos, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
We show that lengths of shortest covers of all rotations of a length-n string over an integer alphabet can be computed in 𝒪(n) time in the word-RAM model, thus improving an 𝒪(n log n)-time algorithm from Crochemore et al. (Theor. Comput. Sci., 2021). Similarly as Crochemore et al., we use a relation of covers of rotations of a string S to seeds and squares in S³. The crucial parameter of a string S is the number ξ(S) of primitive covers of all rotations of S. We show first that the time complexity of the algorithm from Crochemore et al. can be slightly improved which results in time complexity Θ(ξ(S)). However, we also show that in the worst case ξ(S) is Ω(|S|log |S|). This is the main difficulty in obtaining a linear time algorithm. We overcome it and obtain yet another application of runs in strings.

Cite as

Maxime Crochemore, Costas S. Iliopoulos, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba. Linear-Time Computation of Shortest Covers of All Rotations of a String. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 22:1-22:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{crochemore_et_al:LIPIcs.CPM.2022.22,
  author =	{Crochemore, Maxime and Iliopoulos, Costas S. and Radoszewski, Jakub and Rytter, Wojciech and Straszy\'{n}ski, Juliusz and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Linear-Time Computation of Shortest Covers of All Rotations of a String}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{22:1--22:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.22},
  URN =		{urn:nbn:de:0030-drops-161495},
  doi =		{10.4230/LIPIcs.CPM.2022.22},
  annote =	{Keywords: cover, quasiperiod, cyclic rotation, seed, run}
}
Document
Rectangular Tile Covers of 2D-Strings

Authors: Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
We consider tile covers of 2D-strings which are a generalization of periodicity of 1D-strings. We say that a 2D-string A is a tile cover of a 2D-string S if S can be decomposed into non-overlapping 2D-strings, each of them equal to A or to A^T, where A^T is the transpose of A. We show that all tile covers of a 2D-string of size N can be computed in 𝒪(N^{1+ε}) time for any ε > 0. We also show a linear-time algorithm for computing all 1D-strings being tile covers of a 2D-string.

Cite as

Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba. Rectangular Tile Covers of 2D-Strings. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 23:1-23:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{radoszewski_et_al:LIPIcs.CPM.2022.23,
  author =	{Radoszewski, Jakub and Rytter, Wojciech and Straszy\'{n}ski, Juliusz and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Rectangular Tile Covers of 2D-Strings}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{23:1--23:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.23},
  URN =		{urn:nbn:de:0030-drops-161508},
  doi =		{10.4230/LIPIcs.CPM.2022.23},
  annote =	{Keywords: tile cover, periodicity, efficient algorithm}
}
Document
Hardness of Detecting Abelian and Additive Square Factors in Strings

Authors: Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 204, 29th Annual European Symposium on Algorithms (ESA 2021)


Abstract
We prove 3SUM-hardness (no strongly subquadratic-time algorithm, assuming the 3SUM conjecture) of several problems related to finding Abelian square and additive square factors in a string. In particular, we conclude conditional optimality of the state-of-the-art algorithms for finding such factors. Overall, we show 3SUM-hardness of (a) detecting an Abelian square factor of an odd half-length, (b) computing centers of all Abelian square factors, (c) detecting an additive square factor in a length-n string of integers of magnitude n^{𝒪(1)}, and (d) a problem of computing a double 3-term arithmetic progression (i.e., finding indices i ≠ j such that (x_i+x_j)/2 = x_{(i+j)/2}) in a sequence of integers x₁,… ,x_n of magnitude n^{𝒪(1)}. Problem (d) is essentially a convolution version of the AVERAGE problem that was proposed in a manuscript of Erickson. We obtain a conditional lower bound for it with the aid of techniques recently developed by Dudek et al. [STOC 2020]. Problem (d) immediately reduces to problem (c) and is a step in reductions to problems (a) and (b). In conditional lower bounds for problems (a) and (b) we apply an encoding of Amir et al. [ICALP 2014] and extend it using several string gadgets that include arbitrarily long Abelian-square-free strings. Our reductions also imply conditional lower bounds for detecting Abelian squares in strings over a constant-sized alphabet. We also show a subquadratic upper bound in this case, applying a result of Chan and Lewenstein [STOC 2015].

Cite as

Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba. Hardness of Detecting Abelian and Additive Square Factors in Strings. In 29th Annual European Symposium on Algorithms (ESA 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 204, pp. 77:1-77:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{radoszewski_et_al:LIPIcs.ESA.2021.77,
  author =	{Radoszewski, Jakub and Rytter, Wojciech and Straszy\'{n}ski, Juliusz and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Hardness of Detecting Abelian and Additive Square Factors in Strings}},
  booktitle =	{29th Annual European Symposium on Algorithms (ESA 2021)},
  pages =	{77:1--77:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-204-4},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{204},
  editor =	{Mutzel, Petra and Pagh, Rasmus and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2021.77},
  URN =		{urn:nbn:de:0030-drops-146581},
  doi =		{10.4230/LIPIcs.ESA.2021.77},
  annote =	{Keywords: Abelian square, additive square, 3SUM problem}
}
Document
Computing Covers of 2D-Strings

Authors: Panagiotis Charalampopoulos, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
We consider two notions of covers of a two-dimensional string T. A (rectangular) subarray P of T is a 2D-cover of T if each position of T is in an occurrence of P in T. A one-dimensional string S is a 1D-cover of T if its vertical and horizontal occurrences in T cover all positions of T. We show how to compute the smallest-area 2D-cover of an m × n array T in the optimal 𝒪(N) time, where N = mn, all aperiodic 2D-covers of T in 𝒪(N log N) time, and all 2D-covers of T in N^{4/3}⋅ log^{𝒪(1)}N time. Further, we show how to compute all 1D-covers in the optimal 𝒪(N) time. Along the way, we show that the Klee’s measure of a set of rectangles, each of width and height at least √n, on an n × n grid can be maintained in √n⋅ log^{𝒪(1)}n time per insertion or deletion of a rectangle, a result which could be of independent interest.

Cite as

Panagiotis Charalampopoulos, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. Computing Covers of 2D-Strings. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 12:1-12:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.CPM.2021.12,
  author =	{Charalampopoulos, Panagiotis and Radoszewski, Jakub and Rytter, Wojciech and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Computing Covers of 2D-Strings}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{12:1--12:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.12},
  URN =		{urn:nbn:de:0030-drops-139635},
  doi =		{10.4230/LIPIcs.CPM.2021.12},
  annote =	{Keywords: 2D-string, cover, dynamic Klee’s measure problem}
}
Document
The Number of Repetitions in 2D-Strings

Authors: Panagiotis Charalampopoulos, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 173, 28th Annual European Symposium on Algorithms (ESA 2020)


Abstract
The notions of periodicity and repetitions in strings, and hence these of runs and squares, naturally extend to two-dimensional strings. We consider two types of repetitions in 2D-strings: 2D-runs and quartics (quartics are a 2D-version of squares in standard strings). Amir et al. introduced 2D-runs, showed that there are 𝒪(n³) of them in an n × n 2D-string and presented a simple construction giving a lower bound of Ω(n²) for their number (Theoretical Computer Science, 2020). We make a significant step towards closing the gap between these bounds by showing that the number of 2D-runs in an n × n 2D-string is 𝒪(n² log² n). In particular, our bound implies that the 𝒪(n²log n + output) run-time of the algorithm of Amir et al. for computing 2D-runs is also 𝒪(n² log² n). We expect this result to allow for exploiting 2D-runs algorithmically in the area of 2D pattern matching. A quartic is a 2D-string composed of 2 × 2 identical blocks (2D-strings) that was introduced by Apostolico and Brimkov (Theoretical Computer Science, 2000), where by quartics they meant only primitively rooted quartics, i.e. built of a primitive block. Here our notion of quartics is more general and analogous to that of squares in 1D-strings. Apostolico and Brimkov showed that there are 𝒪(n² log² n) occurrences of primitively rooted quartics in an n × n 2D-string and that this bound is attainable. Consequently the number of distinct primitively rooted quartics is 𝒪(n² log² n). The straightforward bound for the maximal number of distinct general quartics is 𝒪(n⁴). Here, we prove that the number of distinct general quartics is also 𝒪(n² log² n). This extends the rich combinatorial study of the number of distinct squares in a 1D-string, that was initiated by Fraenkel and Simpson (Journal of Combinatorial Theory, Series A, 1998), to two dimensions. Finally, we show some algorithmic applications of 2D-runs. Specifically, we present algorithms for computing all occurrences of primitively rooted quartics and counting all general distinct quartics in 𝒪(n² log² n) time, which is quasi-linear with respect to the size of the input. The former algorithm is optimal due to the lower bound of Apostolico and Brimkov. The latter can be seen as a continuation of works on enumeration of distinct squares in 1D-strings using runs (Crochemore et al., Theoretical Computer Science, 2014). However, the methods used in 2D are different because of different properties of 2D-runs and quartics.

Cite as

Panagiotis Charalampopoulos, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń, and Wiktor Zuba. The Number of Repetitions in 2D-Strings. In 28th Annual European Symposium on Algorithms (ESA 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 173, pp. 32:1-32:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.ESA.2020.32,
  author =	{Charalampopoulos, Panagiotis and Radoszewski, Jakub and Rytter, Wojciech and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{The Number of Repetitions in 2D-Strings}},
  booktitle =	{28th Annual European Symposium on Algorithms (ESA 2020)},
  pages =	{32:1--32:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-162-7},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{173},
  editor =	{Grandoni, Fabrizio and Herman, Grzegorz and Sanders, Peter},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2020.32},
  URN =		{urn:nbn:de:0030-drops-128987},
  doi =		{10.4230/LIPIcs.ESA.2020.32},
  annote =	{Keywords: 2D-run, quartic, run, square}
}
Document
Counting Distinct Patterns in Internal Dictionary Matching

Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba

Published in: LIPIcs, Volume 161, 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)


Abstract
We consider the problem of preprocessing a text T of length n and a dictionary 𝒟 in order to be able to efficiently answer queries CountDistinct(i,j), that is, given i and j return the number of patterns from 𝒟 that occur in the fragment T[i..j]. The dictionary is internal in the sense that each pattern in 𝒟 is given as a fragment of T. This way, the dictionary takes space proportional to the number of patterns d=|𝒟| rather than their total length, which could be Θ(n⋅ d). An 𝒪̃(n+d)-size data structure that answers CountDistinct(i,j) queries 𝒪(log n)-approximately in 𝒪̃(1) time was recently proposed in a work that introduced internal dictionary matching [ISAAC 2019]. Here we present an 𝒪̃(n+d)-size data structure that answers CountDistinct(i,j) queries 2-approximately in 𝒪̃(1) time. Using range queries, for any m, we give an 𝒪̃(min(nd/m,n²/m²)+d)-size data structure that answers CountDistinct(i,j) queries exactly in 𝒪̃(m) time. We also consider the special case when the dictionary consists of all square factors of the string. We design an 𝒪(n log² n)-size data structure that allows us to count distinct squares in a text fragment T[i..j] in 𝒪(log n) time.

Cite as

Panagiotis Charalampopoulos, Tomasz Kociumaka, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, and Wiktor Zuba. Counting Distinct Patterns in Internal Dictionary Matching. In 31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020). Leibniz International Proceedings in Informatics (LIPIcs), Volume 161, pp. 8:1-8:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.CPM.2020.8,
  author =	{Charalampopoulos, Panagiotis and Kociumaka, Tomasz and Mohamed, Manal and Radoszewski, Jakub and Rytter, Wojciech and Straszy\'{n}ski, Juliusz and Wale\'{n}, Tomasz and Zuba, Wiktor},
  title =	{{Counting Distinct Patterns in Internal Dictionary Matching}},
  booktitle =	{31st Annual Symposium on Combinatorial Pattern Matching (CPM 2020)},
  pages =	{8:1--8:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-149-8},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{161},
  editor =	{G{\o}rtz, Inge Li and Weimann, Oren},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2020.8},
  URN =		{urn:nbn:de:0030-drops-121336},
  doi =		{10.4230/LIPIcs.CPM.2020.8},
  annote =	{Keywords: dictionary matching, internal pattern matching, squares}
}
Document
Internal Dictionary Matching

Authors: Panagiotis Charalampopoulos, Tomasz Kociumaka, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń

Published in: LIPIcs, Volume 149, 30th International Symposium on Algorithms and Computation (ISAAC 2019)


Abstract
We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary D in fragments of a given string T of length n. The dictionary is internal in the sense that each pattern in D is given as a fragment of T. This way, D takes space proportional to the number of patterns d=|D| rather than their total length, which could be Theta(n * d). In particular, we consider the following types of queries: reporting and counting all occurrences of patterns from D in a fragment T[i..j] (operations Report(i,j) and Count(i,j) below, as well as operation Exists(i,j) that returns true iff Count(i,j)>0) and reporting distinct patterns from D that occur in T[i..j] (operation ReportDistinct(i,j)). We show how to construct, in O((n+d) log^{O(1)} n) time, a data structure that answers each of these queries in time O(log^{O(1)} n+|output|) - see the table below for specific time and space complexities. Query | Preprocessing time | Space | Query time Exists(i,j) | O(n+d) | O(n) | O(1) Report(i,j) | O(n+d) | O(n+d) | O(1+|output|) ReportDistinct(i,j) | O(n log n+d) | O(n+d) | O(log n+|output|) Count(i,j) | O({n log n}/{log log n} + d log^{3/2} n) | O(n+d log n) | O({log^2n}/{log log n}) The case of counting patterns is much more involved and needs a combination of a locally consistent parsing with orthogonal range searching. Reporting distinct patterns, on the other hand, uses the structure of maximal repetitions in strings. Finally, we provide tight - up to subpolynomial factors - upper and lower bounds for the case of a dynamic dictionary.

Cite as

Panagiotis Charalampopoulos, Tomasz Kociumaka, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Internal Dictionary Matching. In 30th International Symposium on Algorithms and Computation (ISAAC 2019). Leibniz International Proceedings in Informatics (LIPIcs), Volume 149, pp. 22:1-22:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


Copy BibTex To Clipboard

@InProceedings{charalampopoulos_et_al:LIPIcs.ISAAC.2019.22,
  author =	{Charalampopoulos, Panagiotis and Kociumaka, Tomasz and Mohamed, Manal and Radoszewski, Jakub and Rytter, Wojciech and Wale\'{n}, Tomasz},
  title =	{{Internal Dictionary Matching}},
  booktitle =	{30th International Symposium on Algorithms and Computation (ISAAC 2019)},
  pages =	{22:1--22:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-130-6},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{149},
  editor =	{Lu, Pinyan and Zhang, Guochuan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2019.22},
  URN =		{urn:nbn:de:0030-drops-115182},
  doi =		{10.4230/LIPIcs.ISAAC.2019.22},
  annote =	{Keywords: string algorithms, dictionary matching, internal pattern matching}
}
  • Refine by Author
  • 19 Rytter, Wojciech
  • 16 Radoszewski, Jakub
  • 11 Waleń, Tomasz
  • 10 Zuba, Wiktor
  • 9 Kociumaka, Tomasz
  • Show More...

  • Refine by Classification
  • 15 Theory of computation → Pattern matching
  • 1 Theory of computation → Automata over infinite objects
  • 1 Theory of computation → Bloom filters and hashing
  • 1 Theory of computation → Verification by model checking

  • Refine by Keyword
  • 4 cover
  • 4 internal pattern matching
  • 4 periodicity
  • 3 circular pattern matching
  • 3 edit distance
  • Show More...

  • Refine by Type
  • 54 document
  • 1 volume

  • Refine by Publication Year
  • 34 2017
  • 5 2024
  • 3 2022
  • 2 2008
  • 2 2018
  • Show More...