3 Search Results for "Bhattacharya, Sudatta"


Document
Streaming Periodicity with Mismatches, Wildcards, and Edits

Authors: Taha El Ghazi and Tatiana Starikovskaya

Published in: LIPIcs, Volume 359, 36th International Symposium on Algorithms and Computation (ISAAC 2025)


Abstract
In this work, we study the problem of detecting periodic trends in strings. While detecting exact periodicity has been studied extensively, real-world data is often noisy, where small deviations or mismatches occur between repetitions. This work focuses on a generalized approach to period detection that efficiently handles noise. Given a string S of length n, the task is to identify integers p such that the prefix and the suffix of S, each of length n-p+1, are similar under a given distance measure. Ergün et al. [APPROX-RANDOM 2017] were the first to study this problem in the streaming model under the Hamming distance. In this work, we combine, in a non-trivial way, the Hamming distance sketch of Clifford et al. [SODA 2019] and the structural description of the k-mismatch occurrences of a pattern in a text by Charalampopoulos et al. [FOCS 2020] to present a more efficient streaming algorithm for period detection under the Hamming distance. As a corollary, we derive a streaming algorithm for detecting periods of strings which may contain wildcards, a special symbol that match any character of the alphabet. Our algorithm is not only more efficient than that of Ergün et al. [TCS 2020], but it also operates without their assumption that the string must be free of wildcards in its final characters. Additionally, we introduce the first two-pass streaming algorithm for computing periods under the edit distance by leveraging and extending the Bhattacharya-Koucký’s grammar decomposition technique [STOC 2023].

Cite as

Taha El Ghazi and Tatiana Starikovskaya. Streaming Periodicity with Mismatches, Wildcards, and Edits. In 36th International Symposium on Algorithms and Computation (ISAAC 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 359, pp. 36:1-36:20, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{elghazi_et_al:LIPIcs.ISAAC.2025.36,
  author =	{El Ghazi, Taha and Starikovskaya, Tatiana},
  title =	{{Streaming Periodicity with Mismatches, Wildcards, and Edits}},
  booktitle =	{36th International Symposium on Algorithms and Computation (ISAAC 2025)},
  pages =	{36:1--36:20},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-408-6},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{359},
  editor =	{Chen, Ho-Lin and Hon, Wing-Kai and Tsai, Meng-Tsung},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ISAAC.2025.36},
  URN =		{urn:nbn:de:0030-drops-249446},
  doi =		{10.4230/LIPIcs.ISAAC.2025.36},
  annote =	{Keywords: approximate periods, pattern matching, streaming algorithms}
}
Document
Many Flavors of Edit Distance

Authors: Sudatta Bhattacharya, Sanjana Dey, Elazar Goldenberg, and Michal Koucký

Published in: LIPIcs, Volume 323, 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024)


Abstract
Several measures exist for string similarity, including notable ones like the edit distance and the indel distance. The former measures the count of insertions, deletions, and substitutions required to transform one string into another, while the latter specifically quantifies the number of insertions and deletions. Many algorithmic solutions explicitly address one of these measures, and frequently techniques applicable to one can also be adapted to work with the other. In this paper, we investigate whether there exists a standardized approach for applying results from one setting to another. Specifically, we demonstrate the capability to reduce questions regarding string similarity over arbitrary alphabets to equivalent questions over a binary alphabet. Furthermore, we illustrate how to transform questions concerning indel distance into equivalent questions based on edit distance. This complements an earlier result of Tiskin (2007) which addresses the inverse direction.

Cite as

Sudatta Bhattacharya, Sanjana Dey, Elazar Goldenberg, and Michal Koucký. Many Flavors of Edit Distance. In 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 323, pp. 11:1-11:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{bhattacharya_et_al:LIPIcs.FSTTCS.2024.11,
  author =	{Bhattacharya, Sudatta and Dey, Sanjana and Goldenberg, Elazar and Kouck\'{y}, Michal},
  title =	{{Many Flavors of Edit Distance}},
  booktitle =	{44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024)},
  pages =	{11:1--11:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-355-3},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{323},
  editor =	{Barman, Siddharth and Lasota, S{\l}awomir},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2024.11},
  URN =		{urn:nbn:de:0030-drops-222004},
  doi =		{10.4230/LIPIcs.FSTTCS.2024.11},
  annote =	{Keywords: Edit distance, Indel distance, Embedding, LCS, Alphabet Reduction}
}
Document
Track A: Algorithms, Complexity and Games
Streaming k-Edit Approximate Pattern Matching via String Decomposition

Authors: Sudatta Bhattacharya and Michal Koucký

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)


Abstract
In this paper we give an algorithm for streaming k-edit approximate pattern matching which uses space Õ(k²) and time Õ(k²) per arriving symbol. This improves substantially on the recent algorithm of Kociumaka, Porat and Starikovskaya [Kociumaka et al., 2022] which uses space Õ(k⁵) and time Õ(k⁸) per arriving symbol. In the k-edit approximate pattern matching problem we get a pattern P and text T and we want to identify all substrings of the text T that are at edit distance at most k from P. In the streaming version of this problem both the pattern and the text arrive in a streaming fashion symbol by symbol and after each symbol of the text we need to report whether there is a current suffix of the text with edit distance at most k from P. We measure the total space needed by the algorithm and time needed per arriving symbol.

Cite as

Sudatta Bhattacharya and Michal Koucký. Streaming k-Edit Approximate Pattern Matching via String Decomposition. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 22:1-22:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bhattacharya_et_al:LIPIcs.ICALP.2023.22,
  author =	{Bhattacharya, Sudatta and Kouck\'{y}, Michal},
  title =	{{Streaming k-Edit Approximate Pattern Matching via String Decomposition}},
  booktitle =	{50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
  pages =	{22:1--22:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-278-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{261},
  editor =	{Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.22},
  URN =		{urn:nbn:de:0030-drops-180741},
  doi =		{10.4230/LIPIcs.ICALP.2023.22},
  annote =	{Keywords: Approximate pattern matching, edit distance, streaming algorithms}
}
  • Refine by Type
  • 3 Document/PDF
  • 1 Document/HTML

  • Refine by Publication Year
  • 1 2025
  • 1 2024
  • 1 2023

  • Refine by Author
  • 2 Bhattacharya, Sudatta
  • 2 Koucký, Michal
  • 1 Dey, Sanjana
  • 1 El Ghazi, Taha
  • 1 Goldenberg, Elazar
  • Show More...

  • Refine by Series/Journal
  • 3 LIPIcs

  • Refine by Classification
  • 2 Theory of computation → Pattern matching
  • 1 Theory of computation → Random projections and metric embeddings
  • 1 Theory of computation → Sketching and sampling

  • Refine by Keyword
  • 2 streaming algorithms
  • 1 Alphabet Reduction
  • 1 Approximate pattern matching
  • 1 Edit distance
  • 1 Embedding
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail