We present the algorithm funnelselect, the first optimal randomized cache-oblivious algorithm for the multiple-selection problem. The algorithm takes as input an unsorted array of N elements and q query ranks r_1 < ⋯ < r_q, and returns in sorted order the q input elements of rank r_1, …, r_q, respectively. The algorithm uses expected and with high probability O(∑_{i = 1}^{q+1} Δ_i/B ⋅ log_{M/B} N/(Δ_i) + N/B) I/Os, where B is the external memory block size, M ≥ B^{1+ε} is the internal memory size, for some constant ε > 0, and Δ_i = r_i - r_{i-1} (assuming r_0 = 0 and r_{q+1} = N + 1). This is the best possible I/O bound in the cache-oblivious and external memory models. The result is achieved by reversing the computation of the cache-oblivious sorting algorithm funnelsort by Frigo, Leiserson, Prokop and Ramachandran [FOCS 1999], using randomly selected pivots for distributing elements, and pruning computations that with high probability are not expected to contain any query ranks.
@InProceedings{brodal_et_al:LIPIcs.ESA.2023.25, author = {Brodal, Gerth St{\o}lting and Wild, Sebastian}, title = {{Funnelselect: Cache-Oblivious Multiple Selection}}, booktitle = {31st Annual European Symposium on Algorithms (ESA 2023)}, pages = {25:1--25:17}, series = {Leibniz International Proceedings in Informatics (LIPIcs)}, ISBN = {978-3-95977-295-2}, ISSN = {1868-8969}, year = {2023}, volume = {274}, editor = {G{\o}rtz, Inge Li and Farach-Colton, Martin and Puglisi, Simon J. and Herman, Grzegorz}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.ESA.2023.25}, URN = {urn:nbn:de:0030-drops-186789}, doi = {10.4230/LIPIcs.ESA.2023.25}, annote = {Keywords: Multiple selection, cache-oblivious algorithm, randomized algorithm, entropy bounds} }
Feedback for Dagstuhl Publishing