The rapid evolution of learned data structures has revolutionized database indexing, particularly for sorted integer datasets. While learned indexes excel in static scenarios due to their low memory footprint, reduced storage requirements, and fast lookup times, benchmarks like SOSD and TLI have largely overlooked compressed indexes and SIMD-based implementations of traditional indexes. This paper addresses this gap by introducing a comprehensive benchmarking framework that (i) evaluates traditional, learned, and compressed indexes across 12 datasets (real and synthetic) of varying types and sizes; (ii) integrates state-of-the-art SIMD-enhanced B-Tree variants; and (iii) measures critical performance metrics such as memory usage, construction time, and lookup efficiency. Our findings reveal that while learned indexes minimize memory usage, a feature useful when internal memory constraints are mandatory, SIMD-enhanced B-Trees consistently achieve superior lookup times with comparable extra space. On the other hand, compressed indexes like LA-vector and EliasFano provide very effective compression of the indexed data with slower access speeds (2x-3x). Another contribution of this paper is a publicly available benchmarking framework (composed of code and datasets) that makes our experiments reproducible and extensible to other indexes and datasets.
@InProceedings{bellomo_et_al:LIPIcs.SEA.2025.5, author = {Bellomo, Lorenzo and Cianci, Giuseppe and de Rosa, Luca and Ferragina, Paolo and Odorisio, Mattia}, title = {{A Comparative Study of Compressed, Learned, and Traditional Indexing Methods for Integer Data}}, booktitle = {23rd International Symposium on Experimental Algorithms (SEA 2025)}, pages = {5:1--5:23}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-375-1}, ISSN = {1868-8969}, year = {2025}, volume = {338}, editor = {Mutzel, Petra and Prezza, Nicola}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2025.5}, URN = {urn:nbn:de:0030-drops-232439}, doi = {10.4230/LIPIcs.SEA.2025.5}, annote = {Keywords: indexing data structures, compression, algorithm engineering, benchmark} }
Feedback for Dagstuhl Publishing