Panagiotis Charalampopoulos, Taha El Ghazi, Jonas Ellert, Paweł Gawrychowski, Tatiana Starikovskaya
Creative Commons Attribution 4.0 International license
Many string processing problems can be phrased in the streaming setting, where the input arrives symbol by symbol and we have sublinear working space. The area of streaming algorithms for string processing has flourished since the seminal work of Porat and Porat [FOCS 2009].
Unfortunately, problems with efficient solutions in the classical setting often do not admit efficient solutions in the streaming setting. As a bridge between these two settings, Saks and Seshadhri [SODA 2013] introduced the asymmetric streaming model (see also [Andoni, Krauthgamer, and Onak; FOCS 2010]). Here, one is given read-only access to a (typically short) reference string R of length m, while a (typically long) text T arrives as a stream.
We provide a generic technique to reduce fundamental string problems in the asymmetric streaming model to the online read-only model, lifting several existing algorithms and generally improving upon the state of the art. Most notably, we obtain asymmetric streaming algorithms for exact and approximate pattern matching (under both the Hamming and edit distances), and for relative Lempel-Ziv compression, a popular scheme for measuring and exploiting redundancy in repetitive text collections.
At the heart of our approach lies a novel tool that facilitates efficient computation in the asymmetric streaming model: the suffix random access data structure. In its simplest variant, it maintains constant-time random access to the longest suffix of (the seen prefix of) T that occurs in R. Let τ be a parameter that denotes the size of the data structure. A straightforward approach maintains the data structure in O(m/τ) time per arriving symbol of T.
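To make the interface concrete, the following is a minimal sketch of the semantics of suffix random access, written as a naive baseline. It is not the data structure from the paper and makes no attempt to match its time or space bounds; the class name `NaiveSuffixRandomAccess` and the methods `push`, `access`, and `suffix` are hypothetical names chosen for illustration. The sketch relies on the fact that the new longest matching suffix is always a suffix of the previous one extended by the arriving symbol.

```python
class NaiveSuffixRandomAccess:
    """Illustrative baseline only (hypothetical names, not the paper's structure).

    Tracks the longest suffix of the seen prefix of T that occurs in R by
    storing one occurrence of it in R as a (position, length) pair.
    """

    def __init__(self, reference: str) -> None:
        self.R = reference  # read-only reference string R
        self.pos = 0        # start of an occurrence of the current suffix in R
        self.length = 0     # length of the current suffix

    def push(self, symbol: str) -> None:
        """Process the next symbol of the stream T."""
        # The new answer is a suffix of (old answer + symbol), so shrink
        # the candidate from the left until it occurs in R.
        candidate = self.R[self.pos:self.pos + self.length] + symbol
        while candidate:
            found = self.R.find(candidate)
            if found != -1:
                self.pos, self.length = found, len(candidate)
                return
            candidate = candidate[1:]
        # Even the single symbol does not occur in R: the suffix is empty.
        self.pos, self.length = 0, 0

    def access(self, i: int) -> str:
        """Return the i-th symbol (0-based) of the current suffix."""
        if not 0 <= i < self.length:
            raise IndexError("index outside the current suffix")
        return self.R[self.pos + i]

    def suffix(self) -> str:
        """Return the whole current suffix (for inspection and testing)."""
        return self.R[self.pos:self.pos + self.length]


if __name__ == "__main__":
    sra = NaiveSuffixRandomAccess("abracadabra")
    for c in "zabrac":
        sra.push(c)
    print(sra.suffix(), sra.access(0), sra.access(sra.length - 1))
```

Each `push` in this sketch rescans R with `str.find`, so it is far slower than the bounds discussed in the abstract; it only pins down what the structure is expected to return.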
We drastically improve this tradeoff and reveal fundamental barriers via a bidirectional reduction between suffix random access and function inversion, a central problem in cryptography:
- By leveraging Fiat and Naor’s function inversion data structure [SIAM J. Comput. 2000], we achieve Õ(1+m³/τ⁶) update time. In particular, for τ = √m, we obtain Õ(1) update time, improving over the Ω(√m) bound of the straightforward solution.
- We establish an unconditional Ω̃(m/τ³) lower bound on the update time. Additionally, we show that achieving update time o(m³/τ⁷) would imply a breakthrough in function inversion.
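The arithmetic behind the τ = √m comparison in the first item can be checked directly. The worked substitution below uses only the bounds stated above, with polylogarithmic factors hidden by the Õ notation left out:

```latex
\[
  \left.\frac{m^{3}}{\tau^{6}}\right|_{\tau=\sqrt{m}}
    = \frac{m^{3}}{(\sqrt{m})^{6}}
    = \frac{m^{3}}{m^{3}}
    = 1,
  \qquad
  \left.\frac{m}{\tau}\right|_{\tau=\sqrt{m}}
    = \frac{m}{\sqrt{m}}
    = \sqrt{m}.
\]
```

So at this size the function-inversion-based bound is Õ(1 + 1) = Õ(1), while the straightforward approach costs on the order of √m per arriving symbol.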
On the way to our upper bound, we propose a variant of string synchronizing sets [Kempa and Kociumaka; STOC 2019] with a local sparsity condition that, as we show, admits an efficient streaming construction algorithm. We believe that our framework and techniques will find broad applications in the development of small-space string algorithms.
@InProceedings{charalampopoulos_et_al:LIPIcs.ICALP.2026.55,
author = {Charalampopoulos, Panagiotis and El Ghazi, Taha and Ellert, Jonas and Gawrychowski, Pawe{\l} and Starikovskaya, Tatiana},
title = {{Suffix Random Access via Function Inversion: A Key for Asymmetric Streaming String Algorithms}},
booktitle = {53rd International Colloquium on Automata, Languages, and Programming (ICALP 2026)},
pages = {55:1--55:20},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-428-4},
ISSN = {1868-8969},
year = {2026},
volume = {374},
editor = {Bhattacharya, Sayan and Nanongkai, Danupon and Benedikt, Michael and Puppis, Gabriele},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ICALP.2026.55},
URN = {urn:nbn:de:0030-drops-264440},
doi = {10.4230/LIPIcs.ICALP.2026.55},
annote = {Keywords: streaming algorithms, function inversion, string algorithms}
}