7 Search Results for "Retha, Ahmad"


Document
Research
On the Construction of Elastic Degenerate Strings

Authors: Nicola Rizzo, Veli Mäkinen, and Nadia Pisanti

Published in: OASIcs, Volume 132, From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday (2025)


Abstract
An elastic degenerate string (EDS) is a sequence of sets of strings. In the context of bioinformatics, EDSes can be used to represent the variations observed in a population from its consensus genome. Pattern matching and comparison problems on EDSes have been widely studied in the literature, but their construction has been largely omitted. We fill this gap by showing how algorithms originally developed for related problems of founder reconstruction can be adapted to minimize the total cardinality of the EDS sets and total length of the EDS strings in linear time, given suitable multiple alignments representing the input data.

Cite as

Nicola Rizzo, Veli Mäkinen, and Nadia Pisanti. On the Construction of Elastic Degenerate Strings. In From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 132, pp. 2:1-2:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{rizzo_et_al:OASIcs.Grossi.2,
  author =	{Rizzo, Nicola and M\"{a}kinen, Veli and Pisanti, Nadia},
  title =	{{On the Construction of Elastic Degenerate Strings}},
  booktitle =	{From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
  pages =	{2:1--2:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-391-1},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{132},
  editor =	{Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.2},
  URN =		{urn:nbn:de:0030-drops-238014},
  doi =		{10.4230/OASIcs.Grossi.2},
  annote =	{Keywords: multiple sequence alignment, pattern matching, data structures, segmentation algorithms, founder reconstruction, dynamic programming, semi-dynamic range minimum queries, positional Burrows-Wheeler transform}
}
Document
Circular Dictionary Matching Using Extended BWT

Authors: Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)


Abstract
The dictionary matching problem involves preprocessing a set of strings (patterns) into a data structure that efficiently identifies all occurrences of these patterns within a query string (text). In this work, we investigate a variation of this problem, termed circular dictionary matching, where the patterns are circular, meaning their cyclic shifts are also considered valid patterns. Such patterns naturally occur in areas such as bioinformatics and computational geometry. Based on the extended Burrows-Wheeler Transformation (eBWT), we design a space-efficient solution for this problem. Specifically, we show that a dictionary of d circular patterns of total length n can be indexed in nlog σ + O(n+dlog n+σ log n) bits of space and support circular dictionary matching on a query text T in O((|T|+occ)log n) time, where σ represents the size of the underlying alphabet and occ represents the output size.

Cite as

Wing-Kai Hon, Rahul Shah, and Sharma V. Thankachan. Circular Dictionary Matching Using Extended BWT. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{hon_et_al:OASIcs.Manzini.11,
  author =	{Hon, Wing-Kai and Shah, Rahul and Thankachan, Sharma V.},
  title =	{{Circular Dictionary Matching Using Extended BWT}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{11:1--11:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.11},
  URN =		{urn:nbn:de:0030-drops-239195},
  doi =		{10.4230/OASIcs.Manzini.11},
  annote =	{Keywords: String algorithms, Burrows-Wheeler transformation, suffix trees, succinct data structures}
}
Document
Graph Indexing Beyond Wheeler Graphs

Authors: Jarno N. Alanko, Elena Biagi, Massimo Equi, Veli Mäkinen, Simon J. Puglisi, Nicola Rizzo, Kunihiko Sadakane, and Jouni Sirén

Published in: OASIcs, Volume 131, The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday (2025)


Abstract
After the discovery of the FM index, which linked the Burrows-Wheeler transform (BWT) to pattern matching on strings, several contemporaneous strands of research began on indexing more complex structures with the BWT, such as tries, finite languages, de Bruijn graphs, and aligned sequences. These directions can now be viewed as culminating in the theory of Wheeler Graphs, but sometimes they go beyond. This chapter reviews the significant body of "proto Wheeler Graph" indexes, many of which exploit characteristics of their specific case to outperform Wheeler graphs, especially in practice.

Cite as

Jarno N. Alanko, Elena Biagi, Massimo Equi, Veli Mäkinen, Simon J. Puglisi, Nicola Rizzo, Kunihiko Sadakane, and Jouni Sirén. Graph Indexing Beyond Wheeler Graphs. In The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday. Open Access Series in Informatics (OASIcs), Volume 131, pp. 13:1-13:29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{alanko_et_al:OASIcs.Manzini.13,
  author =	{Alanko, Jarno N. and Biagi, Elena and Equi, Massimo and M\"{a}kinen, Veli and Puglisi, Simon J. and Rizzo, Nicola and Sadakane, Kunihiko and Sir\'{e}n, Jouni},
  title =	{{Graph Indexing Beyond Wheeler Graphs}},
  booktitle =	{The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
  pages =	{13:1--13:29},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-390-4},
  ISSN =	{2190-6807},
  year =	{2025},
  volume =	{131},
  editor =	{Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.13},
  URN =		{urn:nbn:de:0030-drops-239215},
  doi =		{10.4230/OASIcs.Manzini.13},
  annote =	{Keywords: indexing, compression, compressed data structures, string algorithms, pattern matching}
}
Document
Faster Approximate Elastic-Degenerate String Matching - Part B

Authors: Paweł Gawrychowski, Adam Górkiewicz, Pola Marciniak, Solon P. Pissis, and Karol Pokorski

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)


Abstract
We revisit the complexity of approximate pattern matching in an elastic-degenerate string. Such a string is a sequence of n finite sets of strings of total length N, and compactly describes a collection of strings obtained by first choosing exactly one string in every set, and then concatenating them together. This is motivated by the need of storing a collection of highly similar DNA sequences. The basic algorithmic question on elastic-degenerate strings is pattern matching: given such an elastic-degenerate string and a standard pattern of length m, check if the pattern occurs in one of the strings in the described collection. Bernardini et al. [SICOMP 2022] showed how to leverage fast matrix multiplication to obtain an Õ(nm^{ω-1})+𝒪(N)-time complexity for this problem, where ω is the matrix multiplication exponent. However, from the point of view of possible applications, it is more desirable to work with approximate pattern matching, where we seek approximate occurrences of the pattern. This generalization has been considered in a few papers already, but the best result so far for occurrences with k mismatches, where k is a constant, is the Õ(nm²+N)-time algorithm presented in Part A [CPM 2025]. This brings the question whether increasing the dependency on m from m^{ω-1} to quadratic is necessary when moving from k = 0 to larger (but still constant) k. We design an Õ(nm^{1.5}+N)-time algorithm for pattern matching with k mismatches in an elastic-degenerate string, for any constant k. To obtain this time bound, we leverage the structural characterization of occurrences with k mismatches of Charalampopoulos, Kociumaka, and Wellnitz [FOCS 2020] together with fast Fourier transform. We need to work with multiple patterns at the same time, instead of a single pattern, which requires refining the original characterization. This might be of independent interest.

Cite as

Paweł Gawrychowski, Adam Górkiewicz, Pola Marciniak, Solon P. Pissis, and Karol Pokorski. Faster Approximate Elastic-Degenerate String Matching - Part B. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 29:1-29:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{gawrychowski_et_al:LIPIcs.CPM.2025.29,
  author =	{Gawrychowski, Pawe{\l} and G\'{o}rkiewicz, Adam and Marciniak, Pola and Pissis, Solon P. and Pokorski, Karol},
  title =	{{Faster Approximate Elastic-Degenerate String Matching - Part B}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{29:1--29:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.29},
  URN =		{urn:nbn:de:0030-drops-231236},
  doi =		{10.4230/LIPIcs.CPM.2025.29},
  annote =	{Keywords: ED string, approximate pattern matching, Hamming distance, k mismatches}
}
Document
Faster Approximate Elastic-Degenerate String Matching - Part A

Authors: Solon P. Pissis, Jakub Radoszewski, and Wiktor Zuba

Published in: LIPIcs, Volume 331, 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)


Abstract
An elastic-degenerate (ED) string 𝐓 is a sequence 𝐓 = 𝐓[1] ⋯ 𝐓[n] of n finite sets of strings. The cardinality m of 𝐓 is the total number of strings in 𝐓[i], for all i ∈ [1..n]. The size N of 𝐓 is the total length of all m strings of 𝐓. ED strings have been introduced to represent a set of closely-related DNA sequences. Let P = P[1..p] be a pattern of length p and k > 0 be an integer. We consider the problem of k-Approximate ED String Matching (EDSM): searching k-approximate occurrences of P in the language of 𝐓. We call k-Approximate EDSM under the Hamming distance, k-Mismatch EDSM; and we call k-Approximate EDSM under edit distance, k-Edit EDSM. Bernardini et al. (Theoretical Computer Science, 2020) showed a simple 𝒪(k m p + kN)-time algorithm for k-Mismatch EDSM and an 𝒪(k² m p + kN)-time algorithm for k-Edit EDSM. We improve the dependency on k in both results, obtaining an Õ(k^{2/3}mp+√kN)-time algorithm for k-Mismatch EDSM and an Õ(kmp+ kN)-time algorithm for k-Edit EDSM. Bernardini et al. (Theory of Computing Systems, 2024) presented several algorithms for 1-Approximate EDSM working in Õ(np²+N) time. They have also left the possibility to generalize these solutions for k > 1 as an open problem. We improve the runtime of their solution for 1-Mismatch and 1-Edit EDSM from Õ(np²+N) to 𝒪(np²+N). We further show algorithms for k-Approximate EDSM for the Hamming and edit distances working in Õ(np² + N) time, for any constant k > 0. Finally, we show how our techniques can be applied to improve upon the complexity of the k-Approximate ED String Intersection and k-Approximate Doubly EDSM problems that were introduced very recently by Gabory et al. (Information and Computation, 2025).

Cite as

Solon P. Pissis, Jakub Radoszewski, and Wiktor Zuba. Faster Approximate Elastic-Degenerate String Matching - Part A. In 36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 331, pp. 28:1-28:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{pissis_et_al:LIPIcs.CPM.2025.28,
  author =	{Pissis, Solon P. and Radoszewski, Jakub and Zuba, Wiktor},
  title =	{{Faster Approximate Elastic-Degenerate String Matching - Part A}},
  booktitle =	{36th Annual Symposium on Combinatorial Pattern Matching (CPM 2025)},
  pages =	{28:1--28:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-369-0},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{331},
  editor =	{Bonizzoni, Paola and M\"{a}kinen, Veli},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2025.28},
  URN =		{urn:nbn:de:0030-drops-231227},
  doi =		{10.4230/LIPIcs.CPM.2025.28},
  annote =	{Keywords: ED string, approximate string matching, Hamming distance, edit distance}
}
Document
Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VCF Files On-line

Authors: Solon P. Pissis and Ahmad Retha

Published in: LIPIcs, Volume 103, 17th International Symposium on Experimental Algorithms (SEA 2018)


Abstract
An elastic-degenerate string is a sequence of n sets of strings of total length N. It has been introduced to represent multiple sequence alignments of closely-related sequences in a compact form. For a standard pattern of length m, pattern matching in an elastic-degenerate text can be solved on-line in time O(nm^2+N) with pre-processing time and space O(m) (Grossi et al., CPM 2017). A fast bit-vector algorithm requiring time O(N * ceil[m/w]) with pre-processing time and space O(m * ceil[m/w]), where w is the size of the computer word, was also presented. In this paper we consider the same problem for a set of patterns of total length M. A straightforward generalization of the existing bit-vector algorithm would require time O(N * ceil[M/w]) with pre-processing time and space O(M * ceil[M/w]), which is prohibitive in practice. We present a new on-line O(N * ceil[M/w])-time algorithm with pre-processing time and space O(M). We present experimental results using both synthetic and real data demonstrating the performance of the algorithm. We further demonstrate a real application of our algorithm in a pipeline for discovery and verification of minimal absent words (MAWs) in the human genome showing that a significant number of previously discovered MAWs are in fact false-positives when a population's variants are considered.

Cite as

Solon P. Pissis and Ahmad Retha. Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VCF Files On-line. In 17th International Symposium on Experimental Algorithms (SEA 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 103, pp. 16:1-16:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{pissis_et_al:LIPIcs.SEA.2018.16,
  author =	{Pissis, Solon P. and Retha, Ahmad},
  title =	{{Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VCF Files On-line}},
  booktitle =	{17th International Symposium on Experimental Algorithms (SEA 2018)},
  pages =	{16:1--16:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-070-5},
  ISSN =	{1868-8969},
  year =	{2018},
  volume =	{103},
  editor =	{D'Angelo, Gianlorenzo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2018.16},
  URN =		{urn:nbn:de:0030-drops-89515},
  doi =		{10.4230/LIPIcs.SEA.2018.16},
  annote =	{Keywords: on-line algorithms, algorithms on strings, dictionary matching, elastic-degenerate string, Variant Call Format}
}
Document
On-Line Pattern Matching on Similar Texts

Authors: Roberto Grossi, Costas S. Iliopoulos, Chang Liu, Nadia Pisanti, Solon P. Pissis, Ahmad Retha, Giovanna Rosone, Fatima Vayani, and Luca Versari

Published in: LIPIcs, Volume 78, 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)


Abstract
Pattern matching on a set of similar texts has received much attention, especially recently, mainly due to its application in cataloguing human genetic variation. In particular, many different algorithms have been proposed for the off-line version of this problem; that is, constructing a compressed index for a set of similar texts in order to answer pattern matching queries efficiently. However, the on-line, more fundamental, version of this problem is a rather undeveloped topic. Solutions to the on-line version can be beneficial for a number of reasons; for instance, efficient on-line solutions can be used in combination with partial indexes as practical trade-offs. We make here an attempt to close this gap via proposing two efficient algorithms for this problem. Notably, one of the algorithms requires time linear in the size of the texts' representation, for short patterns. Furthermore, experimental results confirm our theoretical findings in practical terms.

Cite as

Roberto Grossi, Costas S. Iliopoulos, Chang Liu, Nadia Pisanti, Solon P. Pissis, Ahmad Retha, Giovanna Rosone, Fatima Vayani, and Luca Versari. On-Line Pattern Matching on Similar Texts. In 28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 78, pp. 9:1-9:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{grossi_et_al:LIPIcs.CPM.2017.9,
  author =	{Grossi, Roberto and Iliopoulos, Costas S. and Liu, Chang and Pisanti, Nadia and Pissis, Solon P. and Retha, Ahmad and Rosone, Giovanna and Vayani, Fatima and Versari, Luca},
  title =	{{On-Line Pattern Matching on Similar Texts}},
  booktitle =	{28th Annual Symposium on Combinatorial Pattern Matching (CPM 2017)},
  pages =	{9:1--9:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-039-2},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{78},
  editor =	{K\"{a}rkk\"{a}inen, Juha and Radoszewski, Jakub and Rytter, Wojciech},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2017.9},
  URN =		{urn:nbn:de:0030-drops-73379},
  doi =		{10.4230/LIPIcs.CPM.2017.9},
  annote =	{Keywords: string algorithms, pattern matching, degenerate strings, elastic-degenerate strings, on-line algorithms}
}
  • Refine by Type
  • 7 Document/PDF
  • 5 Document/HTML

  • Refine by Publication Year
  • 5 2025
  • 1 2018
  • 1 2017

  • Refine by Author
  • 4 Pissis, Solon P.
  • 2 Mäkinen, Veli
  • 2 Pisanti, Nadia
  • 2 Retha, Ahmad
  • 2 Rizzo, Nicola
  • Show More...

  • Refine by Series/Journal
  • 4 LIPIcs
  • 3 OASIcs

  • Refine by Classification
  • 6 Theory of computation → Pattern matching
  • 2 Theory of computation → Sorting and searching
  • 1 Applied computing → Genomics
  • 1 Theory of computation → Dynamic programming
  • 1 Theory of computation → Formal languages and automata theory
  • Show More...

  • Refine by Keyword
  • 3 pattern matching
  • 2 ED string
  • 2 Hamming distance
  • 2 on-line algorithms
  • 2 string algorithms
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail