47 Search Results for "Starikovskaya, Tatiana"


Volume

LIPIcs, Volume 191

32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)

CPM 2021, July 5-7, 2021, Wrocław, Poland

Editors: Paweł Gawrychowski and Tatiana Starikovskaya

Document
Longest Common Extensions with Wildcards: Trade-Off and Applications

Authors: Gabriel Bathie, Panagiotis Charalampopoulos, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
We study the Longest Common Extension (LCE) problem in a string containing wildcards. Wildcards (also called "don't cares" or "holes") are special characters that match any other character in the alphabet, similar to the character "?" in Unix commands or "." in regular expression engines. We consider the problem parametrized by G, the number of maximal contiguous groups of wildcards in the input string. Our main contribution is a simple data structure for this problem that can be built in O(n (G/t) log n) time, occupies O(nG/t) space, and answers queries in O(t) time, for any t ∈ [1 .. G]. Up to the O(log n) factor, this interpolates smoothly between the data structure of Crochemore et al. [JDA 2015], which has O(nG) preprocessing time and space, and O(1) query time, and a simple solution based on the "kangaroo jumping" technique [Landau and Vishkin, STOC 1986], which has O(n) preprocessing time and space, and O(G) query time. By establishing a connection between this problem and Boolean matrix multiplication, we show that our solution is optimal up to subpolynomial factors when G = Ω(n) under a widely believed hypothesis. In addition, we develop a new simple, deterministic and combinatorial algorithm for sparse Boolean matrix multiplication. Finally, we show that our data structure can be used to obtain efficient algorithms for approximate pattern matching and structural analysis of strings with wildcards. First, we consider the problem of pattern matching with k errors (i.e., edit operations) in the setting where both the pattern and the text may contain wildcards. The "kangaroo jumping" technique can be adapted to yield an algorithm for this problem with runtime O(n(k+G)), where G is the total number of maximal contiguous groups of wildcards in the text and the pattern and n is the length of the text. By combining "kangaroo jumping" with a tailor-made data structure for LCE queries, Akutsu [IPL 1995] devised an O(n√{km} polylog m)-time algorithm. We improve on both algorithms when k ≪ G ≪ m by giving an algorithm with runtime O(n(k + √{Gk log n})). Secondly, we give O(n√G log n)-time and O(n)-space algorithms for computing the prefix array, as well as the quantum/deterministic border and period arrays of a string with wildcards. This is an improvement over the O(n√{nlog n})-time algorithms of Iliopoulos and Radoszewski [CPM 2016] when G = O(n / log n).

Cite as

Gabriel Bathie, Panagiotis Charalampopoulos, and Tatiana Starikovskaya. Longest Common Extensions with Wildcards: Trade-Off and Applications. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 19:1-19:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{bathie_et_al:LIPIcs.ESA.2024.19,
  author =	{Bathie, Gabriel and Charalampopoulos, Panagiotis and Starikovskaya, Tatiana},
  title =	{{Longest Common Extensions with Wildcards: Trade-Off and Applications}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{19:1--19:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.19},
  URN =		{urn:nbn:de:0030-drops-210904},
  doi =		{10.4230/LIPIcs.ESA.2024.19},
  annote =	{Keywords: Longest common prefix, longest common extension, wildcards, Boolean matrix multiplication, approximate pattern matching, periodicity arrays}
}
Document
Pattern Matching with Mismatches and Wildcards

Authors: Gabriel Bathie, Panagiotis Charalampopoulos, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 308, 32nd Annual European Symposium on Algorithms (ESA 2024)


Abstract
In this work, we address the problem of approximate pattern matching with wildcards. Given a pattern P of length m containing D wildcards, a text T of length n, and an integer k, our objective is to identify all fragments of T within Hamming distance k from P. Our primary contribution is an algorithm with runtime 𝒪(n + (D+k)(G+k)⋅ n/m) for this problem. Here, G ≤ D represents the number of maximal wildcard fragments in P. We derive this algorithm by elaborating in a non-trivial way on the ideas presented by [Charalampopoulos, Kociumaka, and Wellnitz, FOCS'20] for pattern matching with mismatches (without wildcards). Our algorithm improves over the state of the art when D, G, and k are small relative to n. For instance, if m = n/2, k = G = n^{2/5}, and D = n^{3/5}, our algorithm operates in 𝒪(n) time, surpassing the Ω(n^{6/5}) time requirement of all previously known algorithms. In the case of exact pattern matching with wildcards (k = 0), we present a much simpler algorithm with runtime 𝒪(n + DG ⋅ n/m) that clearly illustrates our main technical innovation: the utilisation of positions of P that do not belong to any fragment of P with a density of wildcards much larger than D/m as anchors for the sought (approximate) occurrences. Notably, our algorithm outperforms the best-known 𝒪(n log m)-time FFT-based algorithms of [Cole and Hariharan, STOC'02] and [Clifford and Clifford, IPL'04] if DG = o(m log m). We complement our algorithmic results with a structural characterization of the k-mismatch occurrences of P. We demonstrate that in a text of length 𝒪(m), these occurrences can be partitioned into 𝒪((D+k)(G+k)) arithmetic progressions. Additionally, we construct an infinite family of examples with Ω((D+k)k) arithmetic progressions of occurrences, leveraging a combinatorial result on progression-free sets [Elkin, SODA'10].

Cite as

Gabriel Bathie, Panagiotis Charalampopoulos, and Tatiana Starikovskaya. Pattern Matching with Mismatches and Wildcards. In 32nd Annual European Symposium on Algorithms (ESA 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 308, pp. 20:1-20:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{bathie_et_al:LIPIcs.ESA.2024.20,
  author =	{Bathie, Gabriel and Charalampopoulos, Panagiotis and Starikovskaya, Tatiana},
  title =	{{Pattern Matching with Mismatches and Wildcards}},
  booktitle =	{32nd Annual European Symposium on Algorithms (ESA 2024)},
  pages =	{20:1--20:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-338-6},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{308},
  editor =	{Chan, Timothy and Fischer, Johannes and Iacono, John and Herman, Grzegorz},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2024.20},
  URN =		{urn:nbn:de:0030-drops-210910},
  doi =		{10.4230/LIPIcs.ESA.2024.20},
  annote =	{Keywords: pattern matching, wildcards, mismatches, Hamming distance}
}
Document
Internal Pattern Matching in Small Space and Applications

Authors: Gabriel Bathie, Panagiotis Charalampopoulos, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 296, 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024)


Abstract
In this work, we consider pattern matching variants in small space, that is, in the read-only setting, where we want to bound the space usage on top of storing the strings. Our main contribution is a space-time trade-off for the Internal Pattern Matching (IPM) problem, where the goal is to construct a data structure over a string S of length n that allows one to answer the following type of queries: Compute the occurrences of a fragment P of S inside another fragment T of S, provided that |T| < 2|P|. For any τ ∈ [1 . . n/log² n], we present a nearly-optimal Õ(n/τ)-size data structure that can be built in Õ(n) time using Õ(n/τ) extra space, and answers IPM queries in O(τ+log n log³ log n) time. IPM queries have been identified as a crucial primitive operation for the analysis of algorithms on strings. In particular, the complexities of several recent algorithms for approximate pattern matching are expressed with regards to the number of calls to a small set of primitive operations that include IPM queries; our data structure allows us to port these results to the small-space setting. We further showcase the applicability of our IPM data structure by using it to obtain space-time trade-offs for the longest common substring and circular pattern matching problems in the asymmetric streaming setting.

Cite as

Gabriel Bathie, Panagiotis Charalampopoulos, and Tatiana Starikovskaya. Internal Pattern Matching in Small Space and Applications. In 35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 296, pp. 4:1-4:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{bathie_et_al:LIPIcs.CPM.2024.4,
  author =	{Bathie, Gabriel and Charalampopoulos, Panagiotis and Starikovskaya, Tatiana},
  title =	{{Internal Pattern Matching in Small Space and Applications}},
  booktitle =	{35th Annual Symposium on Combinatorial Pattern Matching (CPM 2024)},
  pages =	{4:1--4:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-326-3},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{296},
  editor =	{Inenaga, Shunsuke and Puglisi, Simon J.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2024.4},
  URN =		{urn:nbn:de:0030-drops-201148},
  doi =		{10.4230/LIPIcs.CPM.2024.4},
  annote =	{Keywords: internal pattern matching, longest common substring, small-space algorithms}
}
Document
Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares

Authors: Gabriel Bathie, Tomasz Kociumaka, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 283, 34th International Symposium on Algorithms and Computation (ISAAC 2023)


Abstract
We study the online variant of the language distance problem for two classical formal languages, the language of palindromes and the language of squares, and for the two most fundamental distances, the Hamming distance and the edit (Levenshtein) distance. In this problem, defined for a fixed formal language L, we are given a string T of length n, and the task is to compute the minimal distance to L from every prefix of T. We focus on the low-distance regime, where one must compute only the distances smaller than a given threshold k. In this work, our contribution is twofold: 1) First, we show streaming algorithms, which access the input string T only through a single left-to-right scan. Both for palindromes and squares, our algorithms use O(k polylog n) space and time per character in the Hamming-distance case and O(k² polylog n) space and time per character in the edit-distance case. These algorithms are randomised by necessity, and they err with probability inverse-polynomial in n. 2) Second, we show deterministic read-only online algorithms, which are also provided with read-only random access to the already processed characters of T. Both for palindromes and squares, our algorithms use O(k polylog n) space and time per character in the Hamming-distance case and O(k⁴ polylog n) space and amortised time per character in the edit-distance case.

Cite as

Gabriel Bathie, Tomasz Kociumaka, and Tatiana Starikovskaya. Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 10:1-10:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bathie_et_al:LIPIcs.ISAAC.2023.10,
  author =	{Bathie, Gabriel and Kociumaka, Tomasz and Starikovskaya, Tatiana},
  title =	{{Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares}},
  booktitle =	{34th International Symposium on Algorithms and Computation (ISAAC 2023)},
  pages =	{10:1--10:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-289-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{283},
  editor =	{Iwata, Satoru and Kakimura, Naonori},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2023.10},
  URN =		{urn:nbn:de:0030-drops-193124},
  doi =		{10.4230/LIPIcs.ISAAC.2023.10},
  annote =	{Keywords: Approximate pattern matching, streaming algorithms, palindromes, squares}
}
Document
Compressed Indexing for Consecutive Occurrences

Authors: Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, and Teresa Anna Steiner

Published in: LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)


Abstract
The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact occurrence of a given pattern P. However, practical applications motivate the necessity of considering more complex queries, for example concerning near occurrences of two patterns. Recently, Bille et al. [CPM 2021] introduced a variant of such queries, called gapped consecutive occurrences, in which a query consists of two patterns P₁ and P₂ and a range [a,b], and one must find all consecutive occurrences (q₁,q₂) of P₁ and P₂ such that q₂-q₁ ∈ [a,b]. By their results, we cannot hope for a very efficient indexing structure for such queries, even if a = 0 is fixed (although at the same time they provided a non-trivial upper bound). Motivated by this, we focus on a text given as a straight-line program (SLP) and design an index taking space polynomial in the size of the grammar that answers such queries in time optimal up to polylog factors.

Cite as

Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, and Teresa Anna Steiner. Compressed Indexing for Consecutive Occurrences. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 12:1-12:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.CPM.2023.12,
  author =	{Gawrychowski, Pawe{\l} and Gourdel, Garance and Starikovskaya, Tatiana and Steiner, Teresa Anna},
  title =	{{Compressed Indexing for Consecutive Occurrences}},
  booktitle =	{34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
  pages =	{12:1--12:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-276-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{259},
  editor =	{Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.12},
  URN =		{urn:nbn:de:0030-drops-179666},
  doi =		{10.4230/LIPIcs.CPM.2023.12},
  annote =	{Keywords: Compressed indexing, two patterns, consecutive occurrences}
}
Document
Invited Talk
Streaming Pattern Matching (Invited Talk)

Authors: Tatiana Starikovskaya

Published in: LIPIcs, Volume 212, 32nd International Symposium on Algorithms and Computation (ISAAC 2021)


Abstract
Many classical algorithms for string processing assume that the input can be accessed in full via constant-time random access, which poses a serious limitation in the modern era of data deluge. In this talk, we will focus on the streaming model of computation that allows to overcome this issue. In this model of computation, we assume that the input arrives as a stream, one character at a time, which captures a situation when the data are sequential measurements or an output of an algorithm. The space complexity is defined as all the space used, including the space used to store any information about the input, which allows to develop ultra-efficient algorithms. The first streaming algorithm for pattern matching was presented in the seminal paper of Porat and Porat in FOCS 2009. For a pattern of length m, the algorithm uses only O(log m) space, while any classical algorithm requires Ω(m) space. This result served as a foundation of the area of streaming algorithms for pattern matching. After a brief survey of the area, we will discuss two questions in more details: the k-mismatch problem and the pattern matching with k-edits problem. In the k-mismatch problem, one is given a pattern and a text, and the task is to find all substrings of the text that have at most k mismatches with the pattern. The current best algorithm for this problem was given by Clifford, Kociumaka, and Porat in SODA 2019, and for a pattern of length m it uses O(k log m) space and Õ(√k) time per character of the text. In the pattern matching with k-edits problem, the task is similar, but one must find substrings that can be transformed into the pattern by at most k edits, i.e. substitutions, insertions, and deletions of a character. For this problem, the first streaming algorithm was presented by Kociumaka, Porat, and Starikovskaya in FOCS 2021. The algorithm takes Õ(poly(k)) space and Õ(poly(k)) time per character of the text.

Cite as

Tatiana Starikovskaya. Streaming Pattern Matching (Invited Talk). In 32nd International Symposium on Algorithms and Computation (ISAAC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 212, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{starikovskaya:LIPIcs.ISAAC.2021.1,
  author =	{Starikovskaya, Tatiana},
  title =	{{Streaming Pattern Matching}},
  booktitle =	{32nd International Symposium on Algorithms and Computation (ISAAC 2021)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-214-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{212},
  editor =	{Ahn, Hee-Kap and Sadakane, Kunihiko},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2021.1},
  URN =		{urn:nbn:de:0030-drops-154345},
  doi =		{10.4230/LIPIcs.ISAAC.2021.1},
  annote =	{Keywords: Streaming algorithms, Pattern matching, Hamming distance, Edit distance}
}
Document
Track B: Automata, Logic, Semantics, and Theory of Programming
Property Testing of Regular Languages with Applications to Streaming Property Testing of Visibly Pushdown Languages

Authors: Gabriel Bathie and Tatiana Starikovskaya

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)


Abstract
In this work, we revisit the problem of testing membership in regular languages, first studied by Alon et al. [Alon et al., 2001]. We develop a one-sided error property tester for regular languages under weighted edit distance that makes 𝒪(ε^{-1} log(1/ε)) non-adaptive queries, assuming that the language is described by an automaton of constant size. Moreover, we show a matching lower bound, essentially closing the problem for the edit distance. As an application, we improve the space bound of the current best streaming property testing algorithm for visibly pushdown languages from 𝒪(ε^{-4} log⁶ n) to 𝒪(ε^{-3} log⁵ n log log n), where n is the size of the input. Finally, we provide a Ω(max(ε^{-1}, log n)) lower bound on the memory necessary to test visibly pushdown languages in the streaming model, significantly narrowing the gap between the known bounds.

Cite as

Gabriel Bathie and Tatiana Starikovskaya. Property Testing of Regular Languages with Applications to Streaming Property Testing of Visibly Pushdown Languages. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 119:1-119:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bathie_et_al:LIPIcs.ICALP.2021.119,
  author =	{Bathie, Gabriel and Starikovskaya, Tatiana},
  title =	{{Property Testing of Regular Languages with Applications to Streaming Property Testing of Visibly Pushdown Languages}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{119:1--119:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.119},
  URN =		{urn:nbn:de:0030-drops-141881},
  doi =		{10.4230/LIPIcs.ICALP.2021.119},
  annote =	{Keywords: property testing, regular languages, streaming algorithms, visibly pushdown languages}
}
Document
Complete Volume
LIPIcs, Volume 191, CPM 2021, Complete Volume

Authors: Paweł Gawrychowski and Tatiana Starikovskaya

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
LIPIcs, Volume 191, CPM 2021, Complete Volume

Cite as

32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 1-404, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@Proceedings{gawrychowski_et_al:LIPIcs.CPM.2021,
  title =	{{LIPIcs, Volume 191, CPM 2021, Complete Volume}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{1--404},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021},
  URN =		{urn:nbn:de:0030-drops-139509},
  doi =		{10.4230/LIPIcs.CPM.2021},
  annote =	{Keywords: LIPIcs, Volume 191, CPM 2021, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Paweł Gawrychowski and Tatiana Starikovskaya

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 0:i-0:xiv, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.CPM.2021.0,
  author =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{0:i--0:xiv},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.0},
  URN =		{urn:nbn:de:0030-drops-139512},
  doi =		{10.4230/LIPIcs.CPM.2021.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Invited Talk
Repetitions in Strings: A "Constant" Problem (Invited Talk)

Authors: Hideo Bannai

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
Repeating structures in strings is one of the most fundamental characteristics of strings, and has been an important topic in the field of combinatorics on words and combinatorial pattern matching since their beginnings. In this talk, I will focus on squares and maximal repetitions and review the "runs" theorem [Hideo Bannai et al., 2017] as well as related results (e.g. [Aviezri S. Fraenkel and Jamie Simpson, 1998; Yuta Fujishige et al., 2017; Ryo Sugahara et al., 2019; Philip Bille et al., 2020; Hideo Bannai et al., 2020; Jonas Ellert and Johannes Fischer, 2021]) which address the two main questions: how many of them can be contained in a string of given length, and algorithms for computing them.

Cite as

Hideo Bannai. Repetitions in Strings: A "Constant" Problem (Invited Talk). In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bannai:LIPIcs.CPM.2021.1,
  author =	{Bannai, Hideo},
  title =	{{Repetitions in Strings: A "Constant" Problem}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.1},
  URN =		{urn:nbn:de:0030-drops-139523},
  doi =		{10.4230/LIPIcs.CPM.2021.1},
  annote =	{Keywords: Maximal repetitions, Squares, Lyndon words}
}
Document
Invited Talk
Computing Edit Distance (Invited Talk)

Authors: Michal Koucký

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The edit distance (or Levenshtein distance) between two strings x, y is the minimum number of character insertions, deletions, and substitutions needed to convert x into y. It has numerous applications in various fields from text processing to bioinformatics so algorithms for edit distance computation attract lot of attention. In this talk I will survey recent progress on computational aspects of edit distance in several contexts: computing edit distance approximately, sketching and computing it in streaming model, exchanging strings in communication complexity model, and building error correcting codes for edit distance. I will point out many problems that are still open in those areas.

Cite as

Michal Koucký. Computing Edit Distance (Invited Talk). In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{koucky:LIPIcs.CPM.2021.2,
  author =	{Kouck\'{y}, Michal},
  title =	{{Computing Edit Distance}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{2:1--2:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.2},
  URN =		{urn:nbn:de:0030-drops-139534},
  doi =		{10.4230/LIPIcs.CPM.2021.2},
  annote =	{Keywords: edit distance, streaming algorithms, approximation algorithms, sketching}
}
Document
Invited Talk
On-Line Pattern Matching on D-Texts (Invited Talk)

Authors: Nadia Pisanti

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The Elastic Degenerate String Matching (EDSM) problem is defined as that of finding an occurrence of a pattern P of length m in an ED-text T. A D-text (Degenerate text) is a string that actually represents a set of similar and aligned strings (e.g. a pan-genome [The Computational Pan-Genomics Consortium, 2018]) by collapsing common fragments into a standard string, and representing variants with sets of alternative substrings. When such substrings are not bound to have the same size, then we talk about elastic D-strings (ED-strings). In [R.Grossi et al., 2017] we gave an O(nm²+N) time on-line algorithm for EDSM, where n is the length of T and N is its size, defined as the total number of letters. A fundamental toolkit of our algorithm is the O(m²+N) time solution of the later called Active Prefixes problem (AP). In [K.Aoyama et al., 2018], a O(m^{1.5} √{log m}+N) solution for AP was shown, leading to a O(nm^{1.5} √{log m}+N) time solution for EDSM. The natural open problem was thus whether the 1.5 exponent could furtherly be decreased. In [G.Bernardini et al., 2019], we prove several properties that answer this and other questions: we give a conditional O(nm^{1.5}+N) lower bound for EDSM, proving that a combinatorial algorithm solving EDSM in O(nm^{1.5-ε} +N) time would break the Boolean Matrix Multiplication (BMM) conjecture; we use this result as a hint to devise a non-combinatorial algorithm that solves EDSM in O(nm^{1.381}+N) time; we do so by successfully combining Fast Fourier Transform and properties of string periodicity. In my talk I will overview the results above, as well as some interesting side results: the extension to a dictionary rather than a single pattern [S.P.Pissis and A.Retha, 2018], the introduction of errors [G.Bernardini et al., 2020], and a notion of matching among D-strings with its linear time solution [M.Alzamel et al., 2020].

Cite as

Nadia Pisanti. On-Line Pattern Matching on D-Texts (Invited Talk). In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 3:1-3:2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{pisanti:LIPIcs.CPM.2021.3,
  author =	{Pisanti, Nadia},
  title =	{{On-Line Pattern Matching on D-Texts}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{3:1--3:2},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.3},
  URN =		{urn:nbn:de:0030-drops-139548},
  doi =		{10.4230/LIPIcs.CPM.2021.3},
  annote =	{Keywords: pattern matching, elastic-degenerate string, matrix multiplication}
}
Document
Ranking Bracelets in Polynomial Time

Authors: Duncan Adamson, Vladimir V. Gusev, Igor Potapov, and Argyrios Deligkas

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The main result of the paper is the first polynomial-time algorithm for ranking bracelets. The time-complexity of the algorithm is O(k²⋅ n⁴), where k is the size of the alphabet and n is the length of the considered bracelets. The key part of the algorithm is to compute the rank of any word with respect to the set of bracelets by finding three other ranks: the rank over all necklaces, the rank over palindromic necklaces, and the rank over enclosing apalindromic necklaces. The last two concepts are introduced in this paper. These ranks are key components to our algorithm in order to decompose the problem into parts. Additionally, this ranking procedure is used to build a polynomial-time unranking algorithm.

Cite as

Duncan Adamson, Vladimir V. Gusev, Igor Potapov, and Argyrios Deligkas. Ranking Bracelets in Polynomial Time. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 4:1-4:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{adamson_et_al:LIPIcs.CPM.2021.4,
  author =	{Adamson, Duncan and Gusev, Vladimir V. and Potapov, Igor and Deligkas, Argyrios},
  title =	{{Ranking Bracelets in Polynomial Time}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{4:1--4:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.4},
  URN =		{urn:nbn:de:0030-drops-139554},
  doi =		{10.4230/LIPIcs.CPM.2021.4},
  annote =	{Keywords: Bracelets, Ranking, Necklaces}
}
Document
The k-Mappability Problem Revisited

Authors: Amihood Amir, Itai Boneh, and Eitan Kondratovsky

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The k-mappability problem has two integers parameters m and k. For every subword of size m in a text S, we wish to report the number of indices in S in which the word occurs with at most k mismatches. The problem was lately tackled by Alzamel et al. [Mai Alzamel et al., 2018]. For a text with constant alphabet Σ and k ∈ O(1), they present an algorithm with linear space and O(nlog^{k+1}n) time. For the case in which k = 1 and a constant size alphabet, a faster algorithm with linear space and O(nlog(n)log log(n)) time was presented in [Mai Alzamel et al., 2020]. In this work, we enhance the techniques of [Mai Alzamel et al., 2020] to obtain an algorithm with linear space and O(n log(n)) time for k = 1. Our algorithm removes the constraint of the alphabet being of constant size. We also present linear algorithms for the case of k = 1, |Σ| ∈ O(1) and m = Ω(√n).

Cite as

Amihood Amir, Itai Boneh, and Eitan Kondratovsky. The k-Mappability Problem Revisited. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{amir_et_al:LIPIcs.CPM.2021.5,
  author =	{Amir, Amihood and Boneh, Itai and Kondratovsky, Eitan},
  title =	{{The k-Mappability Problem Revisited}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{5:1--5:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.5},
  URN =		{urn:nbn:de:0030-drops-139566},
  doi =		{10.4230/LIPIcs.CPM.2021.5},
  annote =	{Keywords: Pattern Matching, Hamming Distance, Suffix Tree, Suffix Array}
}
  • Refine by Author
  • 21 Starikovskaya, Tatiana
  • 6 Gawrychowski, Paweł
  • 5 Bathie, Gabriel
  • 5 Charalampopoulos, Panagiotis
  • 3 Pissis, Solon P.
  • Show More...

  • Refine by Classification

  • Refine by Keyword
  • 5 Hamming distance
  • 5 pattern matching
  • 5 streaming algorithms
  • 3 Streaming algorithms
  • 3 Suffix Array
  • Show More...

  • Refine by Type
  • 46 document
  • 1 volume

  • Refine by Publication Year
  • 30 2021
  • 4 2020
  • 3 2019
  • 3 2024
  • 2 2016
  • Show More...

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail