2 Search Results for "Almodaresi, Fatemeh"


Document
Fast Pseudoalignment Queries on Compressed Colored de Bruijn Graphs

Authors: Alessio Campanelli, Giulio Ermanno Pibiri, and Rob Patro

Published in: LIPIcs, Volume 344, 25th International Conference on Algorithms for Bioinformatics (WABI 2025)


Abstract
Motivation. Indexes for the colored de Bruijn graph (c-dBG) play a crucial role in computational biology by facilitating complex tasks such as read mapping and assembly. These indexes map k-mers (substrings of length k) appearing in a large collection of reference strings to the set of identifiers of the strings where they appear. These sets, colloquially referred to as color sets, tend to occupy large quantities of memory, especially for large pangenomes. Our previous work thus focused on leveraging the repetitiveness of the color sets to improve the space effectiveness of the resulting index. As a matter of fact, repetition-aware indexes can be up to one order of magnitude smaller on large pangenomes compared to indexes that do not exploit such repetitiveness. Such improved space effectiveness, on the other hand, imposes an overhead at query time when performing tasks such as pseudoalignment that require the collection and processing of multiple related color sets. Methods. In this paper, we show how to avoid this overhead. We devise novel query algorithms tailored for the specific repetition-aware representations adopted by the Fulgor index, a state-of-the-art c-dBG index, to significantly improve its pseudoalignment efficiency and without consuming additional space. Results. Our results indicate that with increasing redundancy in the pangenomes, the compression factor provided by the Fulgor index increases, while the relative query time actually reduces. For example, while the space of the Fulgor index improves by 2.5× with repetition-aware compression and its query time improves by 1.6× on a collection of 5,000 Salmonella Enterica genomes, these factors become (6.1×,2.8×) and (11.2×,3.2×) for 50,000 and 150,000 genomes respectively. For an even larger collection of 300,000 genomes, we obtained an index that is 22.3× smaller and 2.2× faster.

Cite as

Alessio Campanelli, Giulio Ermanno Pibiri, and Rob Patro. Fast Pseudoalignment Queries on Compressed Colored de Bruijn Graphs. In 25th International Conference on Algorithms for Bioinformatics (WABI 2025). Leibniz International Proceedings in Informatics (LIPIcs), Volume 344, pp. 6:1-6:21, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2025)


Copy BibTex To Clipboard

@InProceedings{campanelli_et_al:LIPIcs.WABI.2025.6,
  author =	{Campanelli, Alessio and Pibiri, Giulio Ermanno and Patro, Rob},
  title =	{{Fast Pseudoalignment Queries on Compressed Colored de Bruijn Graphs}},
  booktitle =	{25th International Conference on Algorithms for Bioinformatics (WABI 2025)},
  pages =	{6:1--6:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-386-7},
  ISSN =	{1868-8969},
  year =	{2025},
  volume =	{344},
  editor =	{Brejov\'{a}, Bro\v{n}a and Patro, Rob},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2025.6},
  URN =		{urn:nbn:de:0030-drops-239327},
  doi =		{10.4230/LIPIcs.WABI.2025.6},
  annote =	{Keywords: Colored de Bruijn graphs, Pseudoalignment, Repetition-aware compression}
}
Document
Rainbowfish: A Succinct Colored de Bruijn Graph Representation

Authors: Fatemeh Almodaresi, Prashant Pandey, and Rob Patro

Published in: LIPIcs, Volume 88, 17th International Workshop on Algorithms in Bioinformatics (WABI 2017)


Abstract
The colored de Bruijn graph— a variant of the de Bruijn graph which associates each edge (i.e., k-mer) with some set of colors - is an increasingly important combinatorial structure in computational biology. Iqbal et al. demonstrated the utility of this structure for representing and assembling a collection (population) of genomes, and showed how it can be used to accurately detect genetic variants. Muggli et al. introduced VARI, a representation of the colored de Bruijn graph that adopts the BOSS representation for the de Bruijn graph topology and achieves considerable savings in space over Cortex, albeit with some sacrifice in speed. The memory-efficient representation of VARI allows the colored de Bruijn graph to be constructed and analyzed for large datasets, beyond what is possible with Cortex. In this paper, we introduce Rainbowfish, a succinct representation of the color information of the colored de Bruijn graph that reduces the space usage even further. Our representation also uses BOSS to represent the de Bruijn graph, but decomposes the color sets based on an equivalence relation and exploits the inherent skewness in the distribution of these color sets. The Rainbowfish representation is compressed based on the 0th-order entropy of the color sets, which can lead to a significant reduction in the space required to store the relevant information for each edge. In practice, Rainbowfish achieves up to a 20x improvement in space over VARI. Rainbowfish is written in C++11 and is available at https://github.com/COMBINE-lab/rainbowfish.

Cite as

Fatemeh Almodaresi, Prashant Pandey, and Rob Patro. Rainbowfish: A Succinct Colored de Bruijn Graph Representation. In 17th International Workshop on Algorithms in Bioinformatics (WABI 2017). Leibniz International Proceedings in Informatics (LIPIcs), Volume 88, pp. 18:1-18:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{almodaresi_et_al:LIPIcs.WABI.2017.18,
  author =	{Almodaresi, Fatemeh and Pandey, Prashant and Patro, Rob},
  title =	{{Rainbowfish: A Succinct Colored de Bruijn Graph Representation}},
  booktitle =	{17th International Workshop on Algorithms in Bioinformatics (WABI 2017)},
  pages =	{18:1--18:15},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-050-7},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{88},
  editor =	{Schwartz, Russell and Reinert, Knut},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.WABI.2017.18},
  URN =		{urn:nbn:de:0030-drops-76576},
  doi =		{10.4230/LIPIcs.WABI.2017.18},
  annote =	{Keywords: de Bruijn graph, succinct data structures, rank and select operation, colored de Bruijn graph}
}
  • Refine by Type
  • 2 Document/PDF
  • 1 Document/HTML

  • Refine by Publication Year
  • 1 2025
  • 1 2017

  • Refine by Author
  • 2 Patro, Rob
  • 1 Almodaresi, Fatemeh
  • 1 Campanelli, Alessio
  • 1 Pandey, Prashant
  • 1 Pibiri, Giulio Ermanno

  • Refine by Series/Journal
  • 2 LIPIcs

  • Refine by Classification
  • 1 Applied computing → Bioinformatics

  • Refine by Keyword
  • 1 Colored de Bruijn graphs
  • 1 Pseudoalignment
  • 1 Repetition-aware compression
  • 1 colored de Bruijn graph
  • 1 de Bruijn graph
  • Show More...

Any Issues?
X

Feedback on the Current Page

CAPTCHA

Thanks for your feedback!

Feedback submitted to Dagstuhl Publishing

Could not send message

Please try again later or send an E-mail