,
Alessio Conte
,
Veronica Guerrini
,
Giulia Punzi
,
Giovanna Rosone
,
Lorenzo Tattini
Creative Commons Attribution 4.0 International license
Compact indices are a fundamental tool in string analysis, even more so in bioinformatics, where genomic sequences can reach billions in length. This paper presents some recent results in which Roberto Grossi has been involved, showing how some of these indices do more than just efficiently represent data, but rather are able to bring out salient information within it, which can be exploited for their downstream analysis. Specifically, we first review a recently-introduced method [Guerrini et al., 2023] that employs the Burrows-Wheeler Transform to build reasonably accurate phylogenetic trees in an assembly-free scenario. We then describe a recent practical tool [Buzzega et al., 2025] for indexing Maximal Common Subsequences between strings, which can enable analysis of genomic sequence similarity. Experimentally, we show that the results produced by the one index are consistent with the expectations about the results of the other index.
@InProceedings{buzzega_et_al:OASIcs.Grossi.20,
author = {Buzzega, Giovanni and Conte, Alessio and Guerrini, Veronica and Punzi, Giulia and Rosone, Giovanna and Tattini, Lorenzo},
title = {{Subsequence-Based Indices for Genome Sequence Analysis}},
booktitle = {From Strings to Graphs, and Back Again: A Festschrift for Roberto Grossi's 60th Birthday},
pages = {20:1--20:21},
series = {Open Access Series in Informatics (OASIcs)},
ISBN = {978-3-95977-391-1},
ISSN = {2190-6807},
year = {2025},
volume = {132},
editor = {Conte, Alessio and Marino, Andrea and Rosone, Giovanna and Vitter, Jeffrey Scott},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Grossi.20},
URN = {urn:nbn:de:0030-drops-238199},
doi = {10.4230/OASIcs.Grossi.20},
annote = {Keywords: String Indices, Burrows-Wheeler Transform, Maximal Common Subsequences, Sequence Analysis, Phylogeny}
}