Documents authored by Nellore, Abhinav


Document
Arbitrary-Length Analogs to de Bruijn Sequences

Authors: Abhinav Nellore and Rachel Ward

Published in: LIPIcs, Volume 223, 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)


Abstract
Let α̃ be a length-L cyclic sequence of characters from a size-K alphabet 𝒜 such that for every positive integer m ≤ L, the number of occurrences of any length-m string on 𝒜 as a substring of α̃ is ⌊ L / K^m ⌋ or ⌈ L / K^m ⌉. When L = K^N for any positive integer N, α̃ is a de Bruijn sequence of order N, and when L ≠ K^N, α̃ shares many properties with de Bruijn sequences. We describe an algorithm that outputs some α̃ for any combination of K ≥ 2 and L ≥ 1 in O(L) time using O(L log K) space. This algorithm extends Lempel’s recursive construction of a binary de Bruijn sequence. An implementation written in Python is available at https://github.com/nelloreward/pkl.
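
The balance condition in this abstract can be checked directly on small inputs. The following is a minimal brute-force sketch, not the paper's O(L) construction and not taken from the linked repository; the function name `is_balanced_cyclic` and its signature are hypothetical. It counts every length-m cyclic substring for each m ≤ L and compares the counts against ⌊ L / K^m ⌋ and ⌈ L / K^m ⌉.

```python
from collections import Counter


def is_balanced_cyclic(seq, alphabet):
    """Brute-force check of the occurrence condition from the abstract.

    Hypothetical helper for illustration only. `seq` is a string read
    cyclically; `alphabet` is a string or set of single characters.
    Runs in polynomial time in len(seq), far slower than O(L).
    """
    L, K = len(seq), len(set(alphabet))
    if L == 0 or any(c not in alphabet for c in seq):
        return False
    doubled = seq + seq  # lets cyclic windows be read as plain slices
    for m in range(1, L + 1):
        lo = L // K**m          # floor(L / K^m)
        hi = -(-L // K**m)      # ceil(L / K^m)
        counts = Counter(doubled[i:i + m] for i in range(L))
        # Every string that occurs must occur lo or hi times.
        if any(not (lo <= c <= hi) for c in counts.values()):
            return False
        # Strings that never occur have count 0, allowed only if lo == 0.
        if lo >= 1 and len(counts) != K**m:
            return False
    return True
```

For example, `is_balanced_cyclic("00010111", "01")` accepts a binary de Bruijn sequence of order 3, and `is_balanced_cyclic("001", "01")` accepts a length-3 sequence whose length is not a power of K.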

Cite as

Abhinav Nellore and Rachel Ward. Arbitrary-Length Analogs to de Bruijn Sequences. In 33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022). Leibniz International Proceedings in Informatics (LIPIcs), Volume 223, pp. 9:1-9:20, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022)


@InProceedings{nellore_et_al:LIPIcs.CPM.2022.9,
  author =	{Nellore, Abhinav and Ward, Rachel},
  title =	{{Arbitrary-Length Analogs to de Bruijn Sequences}},
  booktitle =	{33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
  pages =	{9:1--9:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-234-1},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{223},
  editor =	{Bannai, Hideo and Holub, Jan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2022.9},
  URN =		{urn:nbn:de:0030-drops-161361},
  doi =		{10.4230/LIPIcs.CPM.2022.9},
  annote =	{Keywords: de Bruijn sequence, de Bruijn word, Lempel’s D-morphism, Lempel’s homomorphism}
}
Document
An Invertible Transform for Efficient String Matching in Labeled Digraphs

Authors: Abhinav Nellore, Austin Nguyen, and Reid F. Thompson

Published in: LIPIcs, Volume 191, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)


Abstract
Let G = (V, E) be a digraph where each vertex is unlabeled, each edge is labeled by a character in some alphabet Ω, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of G into a weakly connected digraph G' = (V', E') that enables solving the decision problem of whether there is a walk in G matching an arbitrarily long query string q in time linear in |q| and independent of |E| and |V|. We show G is uniquely determined by G' when for every v_𝓁 ∈ V, there is some distinct string s_𝓁 on Ω such that v_𝓁 is the origin of a closed walk in G matching s_𝓁, and no other walk in G matches s_𝓁 unless it starts and ends at v_𝓁. We then exploit this invertibility condition to strategically alter any G so its transform G' enables retrieval of all t terminal vertices of walks in the unaltered G matching q in O(|q| + t log |V|) time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.
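
The decision problem described in this abstract can be illustrated with a textbook subset (powerset) construction. The sketch below is an assumption-laden illustration, not the paper's transform or its altered-graph retrieval method; the names `powerset_transform` and `has_matching_walk` and the `(tail, label, head)` edge encoding are hypothetical. Each state is the set of vertices at which a walk matching the characters read so far can end, so a query is answered with one dictionary lookup per character. The number of states can grow exponentially in |V| in the worst case.

```python
from collections import deque


def powerset_transform(vertices, edges):
    """Build deterministic transitions over sets of possible end vertices.

    Hypothetical helper. `edges` is an iterable of (tail, label, head)
    triples, read as an edge from `tail` to `head` carrying `label`.
    """
    edges = list(edges)
    step = {}
    for tail, label, head in edges:
        step.setdefault((tail, label), set()).add(head)
    labels = {label for _, label, _ in edges}

    start = frozenset(vertices)  # a walk may start at any vertex
    transitions = {}
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for label in labels:
            nxt = frozenset(
                h for v in state for h in step.get((v, label), ())
            )
            if not nxt:
                continue  # no walk continues with this label
            transitions[(state, label)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return start, transitions


def has_matching_walk(start, transitions, query):
    """Return True if some walk in the original digraph matches `query`."""
    state = start
    for ch in query:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return True
```

With `start, t = powerset_transform({0, 1, 2}, [(0, "a", 1), (1, "b", 2), (2, "a", 0)])`, the call `has_matching_walk(start, t, "aba")` follows three precomputed transitions and never inspects the original edge list.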

Cite as

Abhinav Nellore, Austin Nguyen, and Reid F. Thompson. An Invertible Transform for Efficient String Matching in Labeled Digraphs. In 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021). Leibniz International Proceedings in Informatics (LIPIcs), Volume 191, pp. 20:1-20:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)


@InProceedings{nellore_et_al:LIPIcs.CPM.2021.20,
  author =	{Nellore, Abhinav and Nguyen, Austin and Thompson, Reid F.},
  title =	{{An Invertible Transform for Efficient String Matching in Labeled Digraphs}},
  booktitle =	{32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021)},
  pages =	{20:1--20:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-186-3},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{191},
  editor =	{Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2021.20},
  URN =		{urn:nbn:de:0030-drops-139717},
  doi =		{10.4230/LIPIcs.CPM.2021.20},
  annote =	{Keywords: pattern matching, string matching, Burrows-Wheeler transform, labeled graphs}
}