,
Rahul Shah
,
Sharma V. Thankachan
Creative Commons Attribution 4.0 International license
The dictionary matching problem involves preprocessing a set of strings (patterns) into a data structure that efficiently identifies all occurrences of these patterns within a query string (text). In this work, we investigate a variation of this problem, termed circular dictionary matching, where the patterns are circular, meaning their cyclic shifts are also considered valid patterns. Such patterns naturally occur in areas such as bioinformatics and computational geometry. Based on the extended Burrows-Wheeler Transformation (eBWT), we design a space-efficient solution for this problem. Specifically, we show that a dictionary of d circular patterns of total length n can be indexed in nlog σ + O(n+dlog n+σ log n) bits of space and support circular dictionary matching on a query text T in O((|T|+occ)log n) time, where σ represents the size of the underlying alphabet and occ represents the output size.
@InProceedings{hon_et_al:OASIcs.Manzini.11,
author = {Hon, Wing-Kai and Shah, Rahul and Thankachan, Sharma V.},
title = {{Circular Dictionary Matching Using Extended BWT}},
booktitle = {The Expanding World of Compressed Data: A Festschrift for Giovanni Manzini's 60th Birthday},
pages = {11:1--11:14},
series = {Open Access Series in Informatics (OASIcs)},
ISBN = {978-3-95977-390-4},
ISSN = {2190-6807},
year = {2025},
volume = {131},
editor = {Ferragina, Paolo and Gagie, Travis and Navarro, Gonzalo},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.Manzini.11},
URN = {urn:nbn:de:0030-drops-239195},
doi = {10.4230/OASIcs.Manzini.11},
annote = {Keywords: String algorithms, Burrows-Wheeler transformation, suffix trees, succinct data structures}
}