Search Results

Documents authored by Bhattacharya, Sudatta


Document
Many Flavors of Edit Distance

Authors: Sudatta Bhattacharya, Sanjana Dey, Elazar Goldenberg, and Michal Koucký

Published in: LIPIcs, Volume 323, 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024)


Abstract
Several measures exist for string similarity, including notable ones like the edit distance and the indel distance. The former measures the count of insertions, deletions, and substitutions required to transform one string into another, while the latter specifically quantifies the number of insertions and deletions. Many algorithmic solutions explicitly address one of these measures, and frequently techniques applicable to one can also be adapted to work with the other. In this paper, we investigate whether there exists a standardized approach for applying results from one setting to another. Specifically, we demonstrate the capability to reduce questions regarding string similarity over arbitrary alphabets to equivalent questions over a binary alphabet. Furthermore, we illustrate how to transform questions concerning indel distance into equivalent questions based on edit distance. This complements an earlier result of Tiskin (2007) which addresses the inverse direction.

Cite as

Sudatta Bhattacharya, Sanjana Dey, Elazar Goldenberg, and Michal Koucký. Many Flavors of Edit Distance. In 44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 323, pp. 11:1-11:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)


Copy BibTex To Clipboard

@InProceedings{bhattacharya_et_al:LIPIcs.FSTTCS.2024.11,
  author =	{Bhattacharya, Sudatta and Dey, Sanjana and Goldenberg, Elazar and Kouck\'{y}, Michal},
  title =	{{Many Flavors of Edit Distance}},
  booktitle =	{44th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2024)},
  pages =	{11:1--11:16},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-355-3},
  ISSN =	{1868-8969},
  year =	{2024},
  volume =	{323},
  editor =	{Barman, Siddharth and Lasota, S{\l}awomir},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.FSTTCS.2024.11},
  URN =		{urn:nbn:de:0030-drops-222004},
  doi =		{10.4230/LIPIcs.FSTTCS.2024.11},
  annote =	{Keywords: Edit distance, Indel distance, Embedding, LCS, Alphabet Reduction}
}
Document
Track A: Algorithms, Complexity and Games
Streaming k-Edit Approximate Pattern Matching via String Decomposition

Authors: Sudatta Bhattacharya and Michal Koucký

Published in: LIPIcs, Volume 261, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)


Abstract
In this paper we give an algorithm for streaming k-edit approximate pattern matching which uses space Õ(k²) and time Õ(k²) per arriving symbol. This improves substantially on the recent algorithm of Kociumaka, Porat and Starikovskaya [Kociumaka et al., 2022] which uses space Õ(k⁵) and time Õ(k⁸) per arriving symbol. In the k-edit approximate pattern matching problem we get a pattern P and text T and we want to identify all substrings of the text T that are at edit distance at most k from P. In the streaming version of this problem both the pattern and the text arrive in a streaming fashion symbol by symbol and after each symbol of the text we need to report whether there is a current suffix of the text with edit distance at most k from P. We measure the total space needed by the algorithm and time needed per arriving symbol.

Cite as

Sudatta Bhattacharya and Michal Koucký. Streaming k-Edit Approximate Pattern Matching via String Decomposition. In 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023). Leibniz International Proceedings in Informatics (LIPIcs), Volume 261, pp. 22:1-22:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Copy BibTex To Clipboard

@InProceedings{bhattacharya_et_al:LIPIcs.ICALP.2023.22,
  author =	{Bhattacharya, Sudatta and Kouck\'{y}, Michal},
  title =	{{Streaming k-Edit Approximate Pattern Matching via String Decomposition}},
  booktitle =	{50th International Colloquium on Automata, Languages, and Programming (ICALP 2023)},
  pages =	{22:1--22:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-278-5},
  ISSN =	{1868-8969},
  year =	{2023},
  volume =	{261},
  editor =	{Etessami, Kousha and Feige, Uriel and Puppis, Gabriele},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2023.22},
  URN =		{urn:nbn:de:0030-drops-180741},
  doi =		{10.4230/LIPIcs.ICALP.2023.22},
  annote =	{Keywords: Approximate pattern matching, edit distance, streaming algorithms}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail