,
Alberto Policriti
,
Nicola Prezza
,
Eva Rotenberg
Creative Commons Attribution 3.0 Unported license
String attractors [STOC 2018] are combinatorial objects recently introduced to unify all known dictionary compression techniques in a single theory. A set Gamma subseteq [1..n] is a k-attractor for a string S in Sigma^n if and only if every distinct substring of S of length at most k has an occurrence crossing at least one of the positions in Gamma. Finding the smallest k-attractor is NP-hard for k >= 3, but polylogarithmic approximations can be found using reductions from dictionary compressors. It is easy to reduce the k-attractor problem to a set-cover instance where the string's positions are interpreted as sets of substrings. The main result of this paper is a much more powerful reduction based on the truncated suffix tree. Our new characterization of the problem leads to more efficient algorithms for string attractors: we show how to check the validity and minimality of a k-attractor in near-optimal time and how to quickly compute exact solutions. For example, we prove that a minimum 3-attractor can be found in O(n) time when |Sigma| in O(sqrt[3+epsilon]{log n}) for some constant epsilon > 0, despite the problem being NP-hard for large Sigma.
@InProceedings{kempa_et_al:LIPIcs.ESA.2018.52,
author = {Kempa, Dominik and Policriti, Alberto and Prezza, Nicola and Rotenberg, Eva},
title = {{String Attractors: Verification and Optimization}},
booktitle = {26th Annual European Symposium on Algorithms (ESA 2018)},
pages = {52:1--52:13},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-081-1},
ISSN = {1868-8969},
year = {2018},
volume = {112},
editor = {Azar, Yossi and Bast, Hannah and Herman, Grzegorz},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2018.52},
URN = {urn:nbn:de:0030-drops-95153},
doi = {10.4230/LIPIcs.ESA.2018.52},
annote = {Keywords: Dictionary compression, String attractors, Set cover}
}