Johannes Fischer, Filippo Lari
Creative Commons Attribution 4.0 International license
We introduce the data structure variant of the well-known element distinctness problem. Given an array of n elements, the goal is to preprocess the array into a data structure that supports queries asking whether all elements within a given query range are distinct. This has applications in text indexing and possibly also in other algorithmic domains.
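To make the query concrete, here is a minimal baseline sketch in Python. It is not one of the data structures from the paper: it stores a full array of n integers, which is far more space than the bounds discussed in this abstract. The names `build_far` and `all_distinct` are hypothetical, and the elements are assumed to be hashable. The idea is that, for each left endpoint l, the largest right endpoint `far[l]` for which the range is still duplicate-free can be found with a sliding window, after which a query reduces to a single comparison.

```python
def build_far(arr):
    """Return far, where far[l] is the largest index r such that
    arr[l..r] (inclusive) contains no repeated element."""
    n = len(arr)
    far = [0] * n
    seen = set()   # elements of the current window arr[l..r-1]
    r = 0          # exclusive right end of the current window
    for l in range(n):
        # Extend the window to the right while it stays duplicate-free.
        while r < n and arr[r] not in seen:
            seen.add(arr[r])
            r += 1
        far[l] = r - 1
        # Slide the left end forward before the next iteration.
        seen.discard(arr[l])
    return far


def all_distinct(far, l, r):
    """Answer whether arr[l..r] (inclusive, 0-based) has all distinct elements."""
    return r <= far[l]


arr = [3, 1, 4, 1, 5]
far = build_far(arr)
print(far)
print(all_distinct(far, 0, 2), all_distinct(far, 1, 3))
```

Note that `far` is nondecreasing and satisfies `far[l] >= l`, so the query answers depend only on a monotone sequence rather than on the element values themselves.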
In the indexing model (where access to the input array is allowed), we design a data structure using O((n log b)/b) bits and answering queries in the time needed to solve an online element distinctness instance of size O(b), for any b ≥ 1. As a concrete instantiation of this, there exists an index that answers queries in O(log log log n) time using O((n log²(log log log n))/(log log log n)) bits of additional space.
Moving to the encoding model (where access to the input array is not allowed), we begin by proving an information-theoretic lower bound of 2n - O(log n) bits on the space usage, and then design a matching encoding with O(1)-time queries. We then consider the case in which the alphabet size σ is constant. In this setting, the lower bound can be refined to n log(r_σ) - 3 log(σ+2) + O(1) bits, where r_σ = 4cos²(π/(σ+2)). This lower bound is matched by an encoding with O(1)-time queries.
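As a quick numerical illustration of the constant-alphabet bound, the following Python sketch evaluates only the leading term, log(r_σ) bits per element, and ignores the lower-order term - 3 log(σ+2) + O(1). The logarithm is assumed to be base 2, as is usual for bit counts, and the function names are hypothetical.

```python
import math


def r_sigma(sigma):
    """r_sigma = 4 * cos^2(pi / (sigma + 2)), as defined in the abstract."""
    return 4 * math.cos(math.pi / (sigma + 2)) ** 2


def bits_per_element(sigma):
    """Leading-term space cost per array element, assuming base-2 logarithms."""
    return math.log2(r_sigma(sigma))


for sigma in (2, 3, 4, 16, 256):
    print(sigma, round(bits_per_element(sigma), 4))
```

For σ = 2 the leading term is exactly 1 bit per element, and for σ = 3 it is roughly 1.39 bits. As σ grows, r_σ approaches 4, so the per-element cost approaches 2 bits, in line with the 2n - O(log n) bound for general alphabets.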
@InProceedings{fischer_et_al:LIPIcs.CPM.2026.9,
author = {Fischer, Johannes and Lari, Filippo},
title = {{Indexing and Encoding Arrays for Element Distinctness Queries}},
booktitle = {37th Annual Symposium on Combinatorial Pattern Matching (CPM 2026)},
pages = {9:1--9:17},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {978-3-95977-420-8},
ISSN = {1868-8969},
year = {2026},
volume = {369},
editor = {Bille, Philip and Prezza, Nicola},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.CPM.2026.9},
URN = {urn:nbn:de:0030-drops-259350},
doi = {10.4230/LIPIcs.CPM.2026.9},
annote = {Keywords: element distinctness, range queries, lower bounds, succinct data structures}
}