44 Search Results for "Starikovskaya, Tatiana"


Volume

LIPIcs, Volume 191

32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)

CPM 2021, July 5-7, 2021, Wrocław, Poland

Editors: Paweł Gawrychowski and Tatiana Starikovskaya

Document
Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares

Authors: Gabriel Bathie, Tomasz Kociumaka, and Tatiana Starikovskaya

Published in: LIPIcs, Volume 283, 34th International Symposium on Algorithms and Computation (ISAAC 2023)


Abstract
We study the online variant of the language distance problem for two classical formal languages, the language of palindromes and the language of squares, and for the two most fundamental distances, the Hamming distance and the edit (Levenshtein) distance. In this problem, defined for a fixed formal language L, we are given a string T of length n, and the task is to compute the minimal distance to L from every prefix of T. We focus on the low-distance regime, where one must compute only the distances smaller than a given threshold k. In this work, our contribution is twofold: 1) First, we show streaming algorithms, which access the input string T only through a single left-to-right scan. Both for palindromes and squares, our algorithms use O(k polylog n) space and time per character in the Hamming-distance case and O(k² polylog n) space and time per character in the edit-distance case. These algorithms are randomised by necessity, and they err with probability inverse-polynomial in n. 2) Second, we show deterministic read-only online algorithms, which are also provided with read-only random access to the already processed characters of T. Both for palindromes and squares, our algorithms use O(k polylog n) space and time per character in the Hamming-distance case and O(k⁴ polylog n) space and amortised time per character in the edit-distance case.

Cite as

Gabriel Bathie, Tomasz Kociumaka, and Tatiana Starikovskaya. Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares. In 34th International Symposium on Algorithms and Computation (ISAAC 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 283, pp. 10:1-10:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bathie_et_al:LIPIcs.ISAAC.2023.10,
  author =	{Bathie, Gabriel and Kociumaka, Tomasz and Starikovskaya, Tatiana},
  title =	{{Small-Space Algorithms for the Online Language Distance Problem for Palindromes and Squares}},
  booktitle =	{34th International Symposium on Algorithms and Computation (ISAAC 2023)},
  pages =	{10:1--10:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-289-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{283},
  editor =	{Iwata, Satoru and Kakimura, Naonori},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2023.10},
  URN =		{urn:nbn:de:0030-drops-193124},
  doi =		{10.4230/LIPIcs.ISAAC.2023.10},
  annote =	{Keywords: Approximate pattern matching, streaming algorithms, palindromes, squares}
}
Document
Compressed Indexing for Consecutive Occurrences

Authors: Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, and Teresa Anna Steiner

Published in: LIPIcs, Volume 259, 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)


Abstract
The fundamental question considered in algorithms on strings is that of indexing, that is, preprocessing a given string for specific queries. By now we have a number of efficient solutions for this problem when the queries ask for an exact occurrence of a given pattern P. However, practical applications motivate the necessity of considering more complex queries, for example concerning near occurrences of two patterns. Recently, Bille et al. [CPM 2021] introduced a variant of such queries, called gapped consecutive occurrences, in which a query consists of two patterns P₁ and P₂ and a range [a,b], and one must find all consecutive occurrences (q₁,q₂) of P₁ and P₂ such that q₂-q₁ ∈ [a,b]. By their results, we cannot hope for a very efficient indexing structure for such queries, even if a = 0 is fixed (although at the same time they provided a non-trivial upper bound). Motivated by this, we focus on a text given as a straight-line program (SLP) and design an index taking space polynomial in the size of the grammar that answers such queries in time optimal up to polylog factors.

Cite as

Paweł Gawrychowski, Garance Gourdel, Tatiana Starikovskaya, and Teresa Anna Steiner. Compressed Indexing for Consecutive Occurrences. In 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 259, pp. 12:1-12:22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.CPM.2023.12,
  author =	{Gawrychowski, Pawe{\l} and Gourdel, Garance and Starikovskaya, Tatiana and Steiner, Teresa Anna},
  title =	{{Compressed Indexing for Consecutive Occurrences}},
  booktitle =	{34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
  pages =	{12:1--12:22},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-276-1},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{259},
  editor =	{Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2023.12},
  URN =		{urn:nbn:de:0030-drops-179666},
  doi =		{10.4230/LIPIcs.CPM.2023.12},
  annote =	{Keywords: Compressed indexing, two patterns, consecutive occurrences}
}
Document
Invited Talk
Streaming Pattern Matching (Invited Talk)

Authors: Tatiana Starikovskaya

Published in: LIPIcs, Volume 212, 32nd International Symposium on Algorithms and Computation (ISAAC 2021)


Abstract
Many classical algorithms for string processing assume that the input can be accessed in full via constant-time random access, which poses a serious limitation in the modern era of data deluge. In this talk, we will focus on the streaming model of computation that allows to overcome this issue. In this model of computation, we assume that the input arrives as a stream, one character at a time, which captures a situation when the data are sequential measurements or an output of an algorithm. The space complexity is defined as all the space used, including the space used to store any information about the input, which allows to develop ultra-efficient algorithms. The first streaming algorithm for pattern matching was presented in the seminal paper of Porat and Porat in FOCS 2009. For a pattern of length m, the algorithm uses only O(log m) space, while any classical algorithm requires Ω(m) space. This result served as a foundation of the area of streaming algorithms for pattern matching. After a brief survey of the area, we will discuss two questions in more details: the k-mismatch problem and the pattern matching with k-edits problem. In the k-mismatch problem, one is given a pattern and a text, and the task is to find all substrings of the text that have at most k mismatches with the pattern. The current best algorithm for this problem was given by Clifford, Kociumaka, and Porat in SODA 2019, and for a pattern of length m it uses O(k log m) space and Õ(√k) time per character of the text. In the pattern matching with k-edits problem, the task is similar, but one must find substrings that can be transformed into the pattern by at most k edits, i.e. substitutions, insertions, and deletions of a character. For this problem, the first streaming algorithm was presented by Kociumaka, Porat, and Starikovskaya in FOCS 2021. The algorithm takes Õ(poly(k)) space and Õ(poly(k)) time per character of the text.

Cite as

Tatiana Starikovskaya. Streaming Pattern Matching (Invited Talk). In 32nd International Symposium on Algorithms and Computation (ISAAC 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 212, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{starikovskaya:LIPIcs.ISAAC.2021.1,
  author =	{Starikovskaya, Tatiana},
  title =	{{Streaming Pattern Matching}},
  booktitle =	{32nd International Symposium on Algorithms and Computation (ISAAC 2021)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-214-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{212},
  editor =	{Ahn, Hee-Kap and Sadakane, Kunihiko},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2021.1},
  URN =		{urn:nbn:de:0030-drops-154345},
  doi =		{10.4230/LIPIcs.ISAAC.2021.1},
  annote =	{Keywords: Streaming algorithms, Pattern matching, Hamming distance, Edit distance}
}
Document
Track B: Automata, Logic, Semantics, and Theory of Programming
Property Testing of Regular Languages with Applications to Streaming Property Testing of Visibly Pushdown Languages

Authors: Gabriel Bathie and Tatiana Starikovskaya

Published in: LIPIcs, Volume 198, 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)


Abstract
In this work, we revisit the problem of testing membership in regular languages, first studied by Alon et al. [Alon et al., 2001]. We develop a one-sided error property tester for regular languages under weighted edit distance that makes 𝒪(ε^{-1} log(1/ε)) non-adaptive queries, assuming that the language is described by an automaton of constant size. Moreover, we show a matching lower bound, essentially closing the problem for the edit distance. As an application, we improve the space bound of the current best streaming property testing algorithm for visibly pushdown languages from 𝒪(ε^{-4} log⁶ n) to 𝒪(ε^{-3} log⁵ n log log n), where n is the size of the input. Finally, we provide a Ω(max(ε^{-1}, log n)) lower bound on the memory necessary to test visibly pushdown languages in the streaming model, significantly narrowing the gap between the known bounds.

Cite as

Gabriel Bathie and Tatiana Starikovskaya. Property Testing of Regular Languages with Applications to Streaming Property Testing of Visibly Pushdown Languages. In 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 198, pp. 119:1-119:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bathie_et_al:LIPIcs.ICALP.2021.119,
  author =	{Bathie, Gabriel and Starikovskaya, Tatiana},
  title =	{{Property Testing of Regular Languages with Applications to Streaming Property Testing of Visibly Pushdown Languages}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{119:1--119:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2021.119},
  URN =		{urn:nbn:de:0030-drops-141881},
  doi =		{10.4230/LIPIcs.ICALP.2021.119},
  annote =	{Keywords: property testing, regular languages, streaming algorithms, visibly pushdown languages}
}
Document
Complete Volume
LIPIcs, Volume 191, CPM 2021, Complete Volume

Authors: Paweł Gawrychowski and Tatiana Starikovskaya

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
LIPIcs, Volume 191, CPM 2021, Complete Volume

Cite as

32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 1-404, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@Proceedings{gawrychowski_et_al:LIPIcs.CPM.2021,
  title =	{{LIPIcs, Volume 191, CPM 2021, Complete Volume}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{1--404},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021},
  URN =		{urn:nbn:de:0030-drops-139509},
  doi =		{10.4230/LIPIcs.CPM.2021},
  annote =	{Keywords: LIPIcs, Volume 191, CPM 2021, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Paweł Gawrychowski and Tatiana Starikovskaya

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 0:i-0:xiv, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.CPM.2021.0,
  author =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{0:i--0:xiv},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.0},
  URN =		{urn:nbn:de:0030-drops-139512},
  doi =		{10.4230/LIPIcs.CPM.2021.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Invited Talk
Repetitions in Strings: A "Constant" Problem (Invited Talk)

Authors: Hideo Bannai

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
Repeating structures in strings is one of the most fundamental characteristics of strings, and has been an important topic in the field of combinatorics on words and combinatorial pattern matching since their beginnings. In this talk, I will focus on squares and maximal repetitions and review the "runs" theorem [Hideo Bannai et al., 2017] as well as related results (e.g. [Aviezri S. Fraenkel and Jamie Simpson, 1998; Yuta Fujishige et al., 2017; Ryo Sugahara et al., 2019; Philip Bille et al., 2020; Hideo Bannai et al., 2020; Jonas Ellert and Johannes Fischer, 2021]) which address the two main questions: how many of them can be contained in a string of given length, and algorithms for computing them.

Cite as

Hideo Bannai. Repetitions in Strings: A "Constant" Problem (Invited Talk). In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, p. 1:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bannai:LIPIcs.CPM.2021.1,
  author =	{Bannai, Hideo},
  title =	{{Repetitions in Strings: A "Constant" Problem}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{1:1--1:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.1},
  URN =		{urn:nbn:de:0030-drops-139523},
  doi =		{10.4230/LIPIcs.CPM.2021.1},
  annote =	{Keywords: Maximal repetitions, Squares, Lyndon words}
}
Document
Invited Talk
Computing Edit Distance (Invited Talk)

Authors: Michal Koucký

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The edit distance (or Levenshtein distance) between two strings x, y is the minimum number of character insertions, deletions, and substitutions needed to convert x into y. It has numerous applications in various fields from text processing to bioinformatics so algorithms for edit distance computation attract lot of attention. In this talk I will survey recent progress on computational aspects of edit distance in several contexts: computing edit distance approximately, sketching and computing it in streaming model, exchanging strings in communication complexity model, and building error correcting codes for edit distance. I will point out many problems that are still open in those areas.

Cite as

Michal Koucký. Computing Edit Distance (Invited Talk). In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, p. 2:1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{koucky:LIPIcs.CPM.2021.2,
  author =	{Kouck\'{y}, Michal},
  title =	{{Computing Edit Distance}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{2:1--2:1},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.2},
  URN =		{urn:nbn:de:0030-drops-139534},
  doi =		{10.4230/LIPIcs.CPM.2021.2},
  annote =	{Keywords: edit distance, streaming algorithms, approximation algorithms, sketching}
}
Document
Invited Talk
On-Line Pattern Matching on D-Texts (Invited Talk)

Authors: Nadia Pisanti

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The Elastic Degenerate String Matching (EDSM) problem is defined as that of finding an occurrence of a pattern P of length m in an ED-text T. A D-text (Degenerate text) is a string that actually represents a set of similar and aligned strings (e.g. a pan-genome [The Computational Pan-Genomics Consortium, 2018]) by collapsing common fragments into a standard string, and representing variants with sets of alternative substrings. When such substrings are not bound to have the same size, then we talk about elastic D-strings (ED-strings). In [R.Grossi et al., 2017] we gave an O(nm²+N) time on-line algorithm for EDSM, where n is the length of T and N is its size, defined as the total number of letters. A fundamental toolkit of our algorithm is the O(m²+N) time solution of the later called Active Prefixes problem (AP). In [K.Aoyama et al., 2018], a O(m^{1.5} √{log m}+N) solution for AP was shown, leading to a O(nm^{1.5} √{log m}+N) time solution for EDSM. The natural open problem was thus whether the 1.5 exponent could furtherly be decreased. In [G.Bernardini et al., 2019], we prove several properties that answer this and other questions: we give a conditional O(nm^{1.5}+N) lower bound for EDSM, proving that a combinatorial algorithm solving EDSM in O(nm^{1.5-ε} +N) time would break the Boolean Matrix Multiplication (BMM) conjecture; we use this result as a hint to devise a non-combinatorial algorithm that solves EDSM in O(nm^{1.381}+N) time; we do so by successfully combining Fast Fourier Transform and properties of string periodicity. In my talk I will overview the results above, as well as some interesting side results: the extension to a dictionary rather than a single pattern [S.P.Pissis and A.Retha, 2018], the introduction of errors [G.Bernardini et al., 2020], and a notion of matching among D-strings with its linear time solution [M.Alzamel et al., 2020].

Cite as

Nadia Pisanti. On-Line Pattern Matching on D-Texts (Invited Talk). In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 3:1-3:2, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{pisanti:LIPIcs.CPM.2021.3,
  author =	{Pisanti, Nadia},
  title =	{{On-Line Pattern Matching on D-Texts}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{3:1--3:2},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.3},
  URN =		{urn:nbn:de:0030-drops-139548},
  doi =		{10.4230/LIPIcs.CPM.2021.3},
  annote =	{Keywords: pattern matching, elastic-degenerate string, matrix multiplication}
}
Document
Ranking Bracelets in Polynomial Time

Authors: Duncan Adamson, Vladimir V. Gusev, Igor Potapov, and Argyrios Deligkas

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The main result of the paper is the first polynomial-time algorithm for ranking bracelets. The time-complexity of the algorithm is O(k²⋅ n⁴), where k is the size of the alphabet and n is the length of the considered bracelets. The key part of the algorithm is to compute the rank of any word with respect to the set of bracelets by finding three other ranks: the rank over all necklaces, the rank over palindromic necklaces, and the rank over enclosing apalindromic necklaces. The last two concepts are introduced in this paper. These ranks are key components to our algorithm in order to decompose the problem into parts. Additionally, this ranking procedure is used to build a polynomial-time unranking algorithm.

Cite as

Duncan Adamson, Vladimir V. Gusev, Igor Potapov, and Argyrios Deligkas. Ranking Bracelets in Polynomial Time. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 4:1-4:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{adamson_et_al:LIPIcs.CPM.2021.4,
  author =	{Adamson, Duncan and Gusev, Vladimir V. and Potapov, Igor and Deligkas, Argyrios},
  title =	{{Ranking Bracelets in Polynomial Time}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{4:1--4:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.4},
  URN =		{urn:nbn:de:0030-drops-139554},
  doi =		{10.4230/LIPIcs.CPM.2021.4},
  annote =	{Keywords: Bracelets, Ranking, Necklaces}
}
Document
The k-Mappability Problem Revisited

Authors: Amihood Amir, Itai Boneh, and Eitan Kondratovsky

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The k-mappability problem has two integers parameters m and k. For every subword of size m in a text S, we wish to report the number of indices in S in which the word occurs with at most k mismatches. The problem was lately tackled by Alzamel et al. [Mai Alzamel et al., 2018]. For a text with constant alphabet Σ and k ∈ O(1), they present an algorithm with linear space and O(nlog^{k+1}n) time. For the case in which k = 1 and a constant size alphabet, a faster algorithm with linear space and O(nlog(n)log log(n)) time was presented in [Mai Alzamel et al., 2020]. In this work, we enhance the techniques of [Mai Alzamel et al., 2020] to obtain an algorithm with linear space and O(n log(n)) time for k = 1. Our algorithm removes the constraint of the alphabet being of constant size. We also present linear algorithms for the case of k = 1, |Σ| ∈ O(1) and m = Ω(√n).

Cite as

Amihood Amir, Itai Boneh, and Eitan Kondratovsky. The k-Mappability Problem Revisited. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 5:1-5:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{amir_et_al:LIPIcs.CPM.2021.5,
  author =	{Amir, Amihood and Boneh, Itai and Kondratovsky, Eitan},
  title =	{{The k-Mappability Problem Revisited}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{5:1--5:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.5},
  URN =		{urn:nbn:de:0030-drops-139566},
  doi =		{10.4230/LIPIcs.CPM.2021.5},
  annote =	{Keywords: Pattern Matching, Hamming Distance, Suffix Tree, Suffix Array}
}
Document
Internal Shortest Absent Word Queries

Authors: Golnaz Badkobeh, Panagiotis Charalampopoulos, and Solon P. Pissis

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
Given a string T of length n over an alphabet Σ ⊂ {1,2,…,n^{𝒪(1)}} of size σ, we are to preprocess T so that given a range [i,j], we can return a representation of a shortest string over Σ that is absent in the fragment T[i]⋯ T[j] of T. For any positive integer k ∈ [1,log log_σ n], we present an 𝒪((n/k)⋅ log log_σ n)-size data structure, which can be constructed in 𝒪(nlog_σ n) time, and answers queries in time 𝒪(log log_σ k).

Cite as

Golnaz Badkobeh, Panagiotis Charalampopoulos, and Solon P. Pissis. Internal Shortest Absent Word Queries. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 6:1-6:18, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{badkobeh_et_al:LIPIcs.CPM.2021.6,
  author =	{Badkobeh, Golnaz and Charalampopoulos, Panagiotis and Pissis, Solon P.},
  title =	{{Internal Shortest Absent Word Queries}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{6:1--6:18},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.6},
  URN =		{urn:nbn:de:0030-drops-139570},
  doi =		{10.4230/LIPIcs.CPM.2021.6},
  annote =	{Keywords: string algorithms, internal queries, shortest absent word, bit parallelism}
}
Document
Constructing the Bijective and the Extended Burrows-Wheeler Transform in Linear Time

Authors: Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, and Marcin Piątkowski

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The Burrows-Wheeler transform (BWT) is a permutation whose applications are prevalent in data compression and text indexing. The bijective BWT (BBWT) is a bijective variant of it. Although it is known that the BWT can be constructed in linear time for integer alphabets by using a linear time suffix array construction algorithm, it was up to now only conjectured that the BBWT can also be constructed in linear time. We confirm this conjecture in the word RAM model by proposing a construction algorithm that is based on SAIS, improving the best known result of O(n lg n / lg lg n) time to linear. Since we can reduce the problem of constructing the extended BWT to constructing the BBWT in linear time, we obtain a linear-time algorithm computing the extended BWT at the same time.

Cite as

Hideo Bannai, Juha Kärkkäinen, Dominik Köppl, and Marcin Piątkowski. Constructing the Bijective and the Extended Burrows-Wheeler Transform in Linear Time. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 7:1-7:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{bannai_et_al:LIPIcs.CPM.2021.7,
  author =	{Bannai, Hideo and K\"{a}rkk\"{a}inen, Juha and K\"{o}ppl, Dominik and Pi\k{a}tkowski, Marcin},
  title =	{{Constructing the Bijective and the Extended Burrows-Wheeler Transform in Linear Time}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{7:1--7:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.7},
  URN =		{urn:nbn:de:0030-drops-139588},
  doi =		{10.4230/LIPIcs.CPM.2021.7},
  annote =	{Keywords: Burrows-Wheeler Transform, Lyndon words, Circular Suffix Array, Suffix Array Construction Algorithm}
}
Document
Weighted Ancestors in Suffix Trees Revisited

Authors: Djamal Belazzougui, Dmitry Kosolobov, Simon J. Puglisi, and Rajeev Raman

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
The weighted ancestor problem is a well-known generalization of the predecessor problem to trees. It is known to require Ω(log log n) time for queries provided 𝒪(n polylog n) space is available and weights are from [0..n], where n is the number of tree nodes. However, when applied to suffix trees, the problem, surprisingly, admits an 𝒪(n)-space solution with constant query time, as was shown by Gawrychowski, Lewenstein, and Nicholson (Proc. ESA 2014). This variant of the problem can be reformulated as follows: given the suffix tree of a string s, we need a data structure that can locate in the tree any substring s[p..q] of s in 𝒪(1) time (as if one descended from the root reading s[p..q] along the way). Unfortunately, the data structure of Gawrychowski et al. has no efficient construction algorithm, limiting its wider usage as an algorithmic tool. In this paper we resolve this issue, describing a data structure for weighted ancestors in suffix trees with constant query time and a linear construction algorithm. Our solution is based on a novel approach using so-called irreducible LCP values.

Cite as

Djamal Belazzougui, Dmitry Kosolobov, Simon J. Puglisi, and Rajeev Raman. Weighted Ancestors in Suffix Trees Revisited. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 8:1-8:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{belazzougui_et_al:LIPIcs.CPM.2021.8,
  author =	{Belazzougui, Djamal and Kosolobov, Dmitry and Puglisi, Simon J. and Raman, Rajeev},
  title =	{{Weighted Ancestors in Suffix Trees Revisited}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{8:1--8:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.8},
  URN =		{urn:nbn:de:0030-drops-139594},
  doi =		{10.4230/LIPIcs.CPM.2021.8},
  annote =	{Keywords: suffix tree, weighted ancestors, irreducible LCP, deterministic substring hashing}
}
  • Refine by Author
  • 18 Starikovskaya, Tatiana
  • 6 Gawrychowski, Paweł
  • 3 Pissis, Solon P.
  • 3 Radoszewski, Jakub
  • 2 Bannai, Hideo
  • Show More...

  • Refine by Classification
  • 23 Theory of computation → Pattern matching
  • 4 Mathematics of computing → Combinatorics on words
  • 4 Theory of computation → Data compression
  • 4 Theory of computation → Streaming, sublinear and near linear time algorithms
  • 3 Theory of computation → Data structures design and analysis
  • Show More...

  • Refine by Keyword
  • 5 streaming algorithms
  • 4 Hamming distance
  • 4 pattern matching
  • 3 Streaming algorithms
  • 3 Suffix Array
  • Show More...

  • Refine by Type
  • 43 document
  • 1 volume

  • Refine by Publication Year
  • 30 2021
  • 4 2020
  • 3 2019
  • 2 2016
  • 2 2018
  • Show More...

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail